Calculate item frequency in a NumPy array
If we want to calculate the item frequency in a list, it is quite simple:
from collections import Counter
tt = u'教室 少 , 设施 陈旧 , 与 其他 早 教 中心 比 , 硬件 确实 差 了 些 , 但 收费 挺 高 , 性价比 低 , 不 会 选择 。'
t1 = tt.split(' ')
Counter(t1)  # count the tokens in t1; Counter(tt) would count single characters
If we want to do the same thing with a NumPy array, it is slightly different:
import numpy as np
tn = np.array(t1)
np.unique(tn, return_counts=True)
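As a minimal sketch of how the two returned arrays can be combined into a Counter-like mapping (the sample tokens and the names values, counts and freq are made up for illustration):

```python
import numpy as np

# Hypothetical sample tokens, chosen only for illustration.
tokens = np.array(["b", "a", "c", "a", "b", "a"])

# np.unique returns the sorted unique values and, with return_counts=True,
# the number of times each of them occurs.
values, counts = np.unique(tokens, return_counts=True)

# Pair each value with its count, similar to what Counter gives for a list.
freq = {str(v): int(c) for v, c in zip(values, counts)}
print(freq)  # {'a': 3, 'b': 2, 'c': 1}
```

Unlike Counter, the result is ordered by value rather than by first appearance, because np.unique sorts its output.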
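Although SciPy provided a similar function, itemfreq, it is not as fast as unique
in NumPy; a comparison can be found in this Stack Overflow post. Note that itemfreq was deprecated in SciPy 1.0 and removed in later releases, so the following only runs on older SciPy versions.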
from scipy.stats import itemfreq
itemfreq(tn)
Bonus:
numpy.unique has two other optional parameters, return_index and return_inverse:
np.unique(tn, return_index=True)
np.unique(tn, return_inverse=True)
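To make the two parameters more concrete, here is a small sketch on a made-up integer array (arr and the other variable names are illustrative, not from the original post):

```python
import numpy as np

# Hypothetical input, chosen so the results are easy to check by hand.
arr = np.array([3, 1, 3, 2, 1])

# return_index: position of the first occurrence of each unique value in arr.
values, first_idx = np.unique(arr, return_index=True)
print(values)     # [1 2 3]
print(first_idx)  # [1 3 0]

# return_inverse: for each element of arr, its index into the unique values.
values, inverse = np.unique(arr, return_inverse=True)
print(inverse)    # [2 0 2 1 0]

# Indexing the unique values with the inverse rebuilds the original array.
rebuilt = values[inverse]
```

The inverse indices are handy when the original values need to be replaced by compact integer codes, for example when encoding tokens as category ids.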
Conditional expression (ternary operator)
In Python, a simple if ... else ... statement can be condensed into a conditional expression (the ternary operator):
x = 5  # example value
if x > 3:
    print(x)
else:
    print(0)
# the same logic as a one-line conditional expression
print(x if x > 3 else 0)
In NumPy, such a ternary operation can be applied element-wise to arrays with np.where:
arr = np.random.randn(4, 4)
np.where(arr > 0, 2, -2)
In this example, a random 4 by 4 matrix is constructed. With np.where, every element of the original matrix that is greater than 0 is replaced by 2, and every other element (less than or equal to 0) is replaced by -2.
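Because the random matrix changes on every run, a fixed input makes the behaviour easier to verify. The following is a small sketch with made-up values, including a zero to show which branch it takes:

```python
import numpy as np

# Hypothetical fixed input instead of random numbers, so the result is reproducible.
arr = np.array([[1.5, -0.3],
                [0.0,  2.0]])

# Elements where the condition holds become 2, all others become -2.
result = np.where(arr > 0, 2, -2)
print(result)
# [[ 2 -2]
#  [-2  2]]
```

Note that 0.0 fails the arr > 0 test and therefore ends up as -2.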
Another example:
xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])
np.where(cond, xarr, yarr)
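Here np.where takes each element from xarr where cond is True and from yarr otherwise. As a rough sketch, the call does the same thing as a pure-Python list comprehension built on the conditional expression (the names vectorized and looped are only for illustration):

```python
import numpy as np

xarr = np.array([1.1, 1.2, 1.3, 1.4, 1.5])
yarr = np.array([2.1, 2.2, 2.3, 2.4, 2.5])
cond = np.array([True, False, True, True, False])

# Vectorized selection: pick from xarr where cond is True, else from yarr.
vectorized = np.where(cond, xarr, yarr)

# Equivalent element-by-element version using the ternary expression.
looped = np.array([x if c else y for x, y, c in zip(xarr, yarr, cond)])

print(vectorized)  # [1.1 2.2 1.3 1.4 2.5]
```

The np.where version avoids the Python-level loop, which usually matters for large arrays.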