Data Analysis with Python: Calculating Mean and Median

I'm working on a data analysis project in Python and need to calculate both the mean and median of a dataset. I understand the basic concepts, but I'm looking for a Python code example that demonstrates how to do this efficiently.

Let's say I have a list of numbers:

Code:
data = [12, 45, 67, 23, 41, 89, 34, 54, 21]

I want to calculate both the mean and median of these numbers. Could you provide a Python code snippet that accomplishes this? Additionally, it would be helpful if you could explain any libraries or functions used in the code.

Thank you for your assistance in calculating these basic statistics for my data analysis project!
 
You can use numpy for those calculations on datasets.

Code:
import numpy

data = [12, 45, 67, 23, 41, 89, 34, 54, 21]

mean = numpy.mean(data)
median = numpy.median(data)

print(mean)
print(median)
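For anyone wondering what those two calls actually compute, here is a minimal plain-Python sketch of the same arithmetic. The helper names `mean_of` and `median_of` are made up for illustration and are not part of numpy.

```python
def mean_of(values):
    """Arithmetic mean: the sum divided by the number of items."""
    return sum(values) / len(values)


def median_of(values):
    """Middle value of the sorted data (average of the two middle values for an even count)."""
    ordered = sorted(values)  # sorted() returns a new list; the input is left untouched
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [12, 45, 67, 23, 41, 89, 34, 54, 21]
print(mean_of(data))    # 386 / 9, roughly 42.89
print(median_of(data))  # 41
```

Neither helper handles an empty list; that case is left out to keep the sketch short.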
 
Using the pandas .mean() and .median() methods would be easiest.

So if you have a DataFrame 'df' with a column called 'price' (for example), the code would be:

df['price'].mean()
df['price'].median()

If you want to calculate the mean/median of a plain list (like the one you have above), then use numpy:

import numpy as np

data = [12, 45, 67, 23, 41, 89, 34, 54, 21]
np.mean(data)
np.median(data)

Hope that helps
 
GPT-4 also likes Drawdown Addict's use of data function.

 
import statistics

data = [1, 2, 3, 4, 5, 6]

statistics.mean(data)
statistics.median(data)

The statistics module has been part of the Python standard library since version 3.4.
There is no need to use numpy or pandas as otherwise suggested. The downside of this approach is that statistics.median() always sorts your data first, whether or not it is already sorted, which means possible overhead. But numpy's median has to do similar work internally.
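A short sketch of the same approach applied to the list from the original question, using only the standard library. The variable names are arbitrary; the comments show what to expect for this particular data.

```python
import statistics

data = [12, 45, 67, 23, 41, 89, 34, 54, 21]
snapshot = list(data)  # copy kept only to show that the input is not reordered

mean_value = statistics.mean(data)
median_value = statistics.median(data)

print(mean_value)    # 386 / 9, roughly 42.89
print(median_value)  # 41

# median() sorts a copy internally, so the caller's list keeps its order
print(data == snapshot)  # True

# With an even number of items, median() averages the two middle values
print(statistics.median([1, 2, 3, 4, 5, 6]))  # 3.5
```

The internal sort works on a copy, so the overhead is extra time and memory for large inputs rather than any change to your list.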
 
I would not call this the easiest when there is an alternative, equally fast solution that is already built in. See the post above. Yours requires an external library that must be installed. Most often it already is, since pandas and numpy are very useful libraries, but still, I guess I am a stickler for details. I actually quit using pandas and migrated everything to polars.

 
Terrible username, and I hope you are aware of the connotations... not that there's anything wrong with that.

If you are aware, then kudos on keeping it real! Hahaha
 
Well, the entire quant community these days uses pandas and numpy as its primary data manipulation and analysis tooling, so…
 
...so, what I said. It is still not the easiest or simplest way unless your data is already contained in a numpy array or pandas DataFrame.
