1338. Reduce Array Size to The Half
You are given an integer array arr. You can choose a set of integers and remove all the occurrences of these integers in the array.
Return the minimum size of the set so that at least half of the integers of the array are removed.
Example 1:
Input: arr = [3,3,3,3,5,5,5,2,2,7]
Output: 2
Explanation: Choosing {3,7} will make the new array [5,5,5,2,2] which has size 5 (i.e equal to half of the size of the old array).
Possible sets of size 2 are {3,5},{3,2},{5,2}.
Choosing set {2,7} is not possible as it will make the new array [3,3,3,3,5,5,5] which has a size greater than half of the size of the old array.
Example 2:
Input: arr = [7,7,7,7,7,7]
Output: 1
Explanation: The only possible set you can choose is {7}. This will make the new array empty.
Reduce Array Size to The Half - Detailed Explanation
In this problem, we are given an array of integers and asked to perform operations that remove elements from it, until the array becomes no more than half of its original size.
The catch? Each operation allows us to pick one number and remove all occurrences of it at once. The ultimate goal is to minimize the number of such operations. So, the fewer numbers we choose, the better.
✨ Key Insight
Not all numbers are equal in this scenario. Some numbers appear many times, while others barely show up. So logically, we want to remove numbers that occur the most. This way, we eliminate more elements per operation.
Example
Suppose we have this array:
[1, 1, 2, 2, 2, 3, 3, 4]
Here’s the count for each number:
- 1 appears 2 times
- 2 appears 3 times
- 3 appears 2 times
- 4 appears 1 time
To reduce the array to half (original length is 8, so we need ≤ 4 elements), we can remove the number with the highest frequency first (which is 2). That removes 3 elements in one go.
After that, we just need to remove one more number to make the total removed ≥ 4. We can choose either 1 or 3.
Approach 1: Sorting
🔍 Intuition
First, we count how many times each number occurs. To make this easier, we sort the array so that identical elements are next to each other. This helps in grouping them together.
🚀 Steps
- Sort the array.
- Count the frequency of each number.
- Sort the frequencies in descending order.
- Keep removing the most frequent numbers until we have removed at least half of the array.
💡 Algorithm
sort(arr) counts = frequency of each number sort(counts, descending) removed = 0 operations = 0 for count in counts: removed += count operations += 1 if removed >= n / 2: break return operations
class Solution {
public int minSetSize(int[] arr) {
Map<Integer, Integer> hmap = new HashMap<>();
for (int elem : arr) {
hmap.put(elem, hmap.getOrDefault(elem, 0) + 1);
}
PriorityQueue<Integer> q = new PriorityQueue<>((a, b) -> (b - a));
for (int val : hmap.values()) {
q.offer(val);
}
int sum = 0;
int count = 0;
while (sum < arr.length / 2) {
sum += q.poll();
count++;
}
return count;
}
}
⏱ Time Complexity
O(n log n): Sorting takes O(n log n), counting and iterating are O(n).
🧠 Space Complexity
O(n): We store the counts of each number, and possibly a copy for sorting.
Approach 2: Hashing / Counting
🔍 Intuition
Instead of sorting the array first, we can go straight to counting frequencies using a hashmap or dictionary. This way, we save time and keep things efficient.
🚀 Steps
- Create a hashmap where key = number and value = its count.
- Extract all frequency values.
- Sort them in descending order.
- Keep removing until removed elements ≥ n/2.
💡 Sample Code (Pseudocode)
map = new HashMap() for num in arr: map[num] += 1 counts = values of map sort(counts, descending) removed = 0 steps = 0 for count in counts: removed += count steps += 1 if removed >= n / 2: break return steps
⏱ Time Complexity
O(n log n) in worst case, due to sorting frequencies.
🧠 Space Complexity
O(n): HashMap and frequency list use extra space.
✨ Advantage
This is often faster in practice than the sorting-based approach, especially when there are fewer unique numbers.
Approach 3: Hashing + Bucket Sort
🔍 Intuition
The previous approach sorts the counts array which can take O(n log n). But do we really need a comparison-based sort? Nope! Since the maximum count of any number cannot exceed n, we can use a bucket sort which runs in linear time.
🚀 Steps
- Count frequencies using a HashMap.
- Initialize a bucket array of size n+1.
- For each frequency value, add +1 at index bucket[freq].
- Start from highest bucket index (most frequent), removing elements until we reach n/2.
💡 Pseudocode
map = new HashMap() for num in arr: map[num] += 1 bucket = new int[n+1] for count in map.values(): bucket[count] += 1 removed = 0 operations = 0 for i = n to 1: while bucket[i] > 0 and removed < n/2: removed += i operations += 1 bucket[i] -= 1 return operations
⏱ Time Complexity
O(n): Both frequency counting and bucket sorting take linear time.
🧠 Space Complexity
O(n): HashMap + bucket array.
💥 Performance Boost
This method gives you the best of both worlds: fast execution and fewer lines of code if implemented well.
✨ Summary & Recommendations
- Approach 1 is simple and intuitive but not the most efficient.
- Approach 2 is a practical improvement using hashing — clean and fast.
- Approach 3 is best for optimal time performance, especially in large inputs.
For interviews or time-sensitive platforms like LeetCode, **Approach 2** is a great balance of simplicity and performance. But if you’re going the extra mile for performance, **Approach 3** is the way to go!