H-Index Problem

Question

H-Index Problem

asked Sep 14, 2020 in Coding Questions by Vaibhav98 Goeduhub's Expert (2.3k points)

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher's h-index.

According to the definition of h-index on Wikipedia: "A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each."

Example:

Input: citations = [3,0,6,1,5]
Output: 3 
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had 
             received 3, 0, 6, 1, 5 citations respectively. 
             Since the researcher has 3 papers with at least 3 citations each and the remaining 
             two with no more than 3 citations each, her h-index is 3.

Note: If there are several possible values for h, the maximum one is taken as the h-index.

1 Answer

answered Sep 15, 2020 by Vaibhav98 Goeduhub's Expert (2.3k points)

Best answer

The main trick is to count for each possible number of citations, how many times we see this number in our citations. Note, that if number of citations is more than total number of papers N, we can reduce this numer to N and nothing will change. Let me explain my solutoin given test example [3,0,6,1,5].

We create array, which I called buckets = [1, 1, 0, 1, 0, 2], where buckets[i] is number of papers with i citations if i < N and bucket[N] is number of papers with >=N citations.

Now, we create accum array, where accum[0] is number of papers with >=1 citation, accum[1] is number of papers with >=2 citations and so on. When we evaluate it for our example we can see, that it is equal to accum = [4,3,3,2,2]. Note, that we start with 1 citation, not with zero, that is why we need to use accum[1:] in our code.

Finally, we need to go through this array and find the bigest number i, for which accum[i] >= i + 1 and return i + 1, in our example it is equal to 3.

Complexity

Complexity is O(n), though we traverse our data 4 times, and it is not the optimal solution. We can evaluate cumulative sums in place, and compare them in place with i + 1 and keep the index of maximum reached so far, and interrupt when inequality accum[i] >= i + 1 does not hold anymore. Howerer I like cumulative sums and code is more clean in my way.

class Solution:

def hIndex(self, citations):

N = len(citations)

buckets = [0] * (N + 1)

for elem in citations:

buckets[min(elem, N)] += 1

accum = list(accumulate(buckets[1:][::-1]))[::-1]

compar = [accum[i] >= i + 1 for i in range(N)]

return (compar + [0]).index(0)

eg : ans = Solution()

ans.hIndex([3,0,6,1,5])

Output -

3

SUMMER TRAINING	Free Tutorials	Go To Your University	Placement Preparation
*Join our Telegram Channel To take free Online Courses*

H-Index Problem

Please log in or register to answer this question.

1 Answer

Please log in or register to add a comment.

Our Mentors(For AI-ML)

Related questions