NumPy: Averages of Data Sets

What are the meanings of the various types of averages in datasets?

  • Mean == the "centre" ("center") of a dataset. 
    • If you have an array: array_1 = np.array([1, 2, 3, 4, 5]), the mean of this array is 3, because 1 + 2 + 3 + 4 + 5 = 15 and 15 / 5 (the number of elements in the array) = 3. Thus, 3 is the average, or mean, of this particular array.
    • The mean is affected by outliers: a single extreme value can pull it a long way from the bulk of the data.
  • Median == the "middle" of a dataset. 
    • If you have a list [1, 1, 4, 7, 8, 9, 9], then 7 is the median of the list, because it is the middle value once the list is sorted: three values lie below it and three above it. Note that this is not the same as the point halfway between the minimum and the maximum value (which here would be 5).
    • If you have a list whose length is an even number, say [1, 2, 2, 3, 4, 5, 5, 7] (8 numbers), then the median is the halfway point between the two middle numbers (in this case 3 and 4), so the median of the list above is 3.5.
    • Of course, we're likely to be dealing with very large lists and arrays, so sorting them and working out the middle numbers ourselves would become a very tedious task. We can avoid this by using the np.median function.
    • The median is far less affected by outliers than the mean, because it depends only on the middle value(s) of the sorted data.
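
As a rough sketch of the points above, the snippet below computes the mean and median of the example lists with np.mean and np.median, then appends an outlier to show how each one reacts. The variable names and the outlier value of 1000 are illustrative assumptions and are not part of the original examples.

```python
import numpy as np

# Mean of the first example array: 15 / 5 = 3.0
array_1 = np.array([1, 2, 3, 4, 5])
mean_1 = np.mean(array_1)

# Median of an odd-length list: the middle (4th) value, 7.0
odd_list = np.array([1, 1, 4, 7, 8, 9, 9])
median_odd = np.median(odd_list)

# Median of an even-length list: halfway between 3 and 4, i.e. 3.5
even_list = np.array([1, 2, 2, 3, 4, 5, 5, 7])
median_even = np.median(even_list)

# Hypothetical outlier (1000) added to array_1 for illustration
with_outlier = np.append(array_1, 1000)
mean_with_outlier = np.mean(with_outlier)      # 1015 / 6, roughly 169.17
median_with_outlier = np.median(with_outlier)  # halfway between 3 and 4 = 3.5

print(mean_1, median_odd, median_even)
print(mean_with_outlier, median_with_outlier)
```

With the outlier added, the mean jumps from 3 to roughly 169, while the median only moves from 3 to 3.5.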

Finding Percentages: 

You can use NumPy's mean function, np.mean, together with a comparison operator to work out what percentage of a dataset meets a condition. For example, if you have array_example = np.array([15, 18, 9, 5, 4, 21, 10, 16]) and you want to find the percentage of elements greater than 10, you can use:
>>> np.mean(array_example > 10)
0.5
 
The result, 0.5, is a fraction; multiplied by 100, it gives 50%.

Why does this work? The comparison array_example > 10 is applied to every element of the array at once and produces a new array of Boolean values. Where an element is greater than 10, the result is True (which counts as 1). Where it is less than or equal to 10, the result is False (which counts as 0). The mean function then adds up these values, which gives the number of True results, and divides by the number of elements in the array (in this case 4 / 8 = 0.5). In other words, 50% of the elements in the Boolean array are True, which is the same as saying that 50% of the elements in the original array are greater than 10.
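
To make the intermediate steps visible, here is a minimal sketch that splits the one-liner into its parts. The names mask, count_true, fraction and percentage are made up for illustration; only array_example comes from the example above.

```python
import numpy as np

array_example = np.array([15, 18, 9, 5, 4, 21, 10, 16])

# Element-wise comparison: one True/False value per element
mask = array_example > 10

# True counts as 1 and False as 0, so summing counts the True values
count_true = mask.sum()

# The mean of the mask is count_true / number of elements
fraction = mask.mean()

# Convert the fraction to a percentage
percentage = fraction * 100

print(mask)
print(count_true, fraction, percentage)
```

Note that 10 itself is not counted, because the comparison is strictly greater than 10.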




