Inspiration from tsfresh
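While reading a paper today, I noticed that its author, who has not been cited much, mentioned tsfresh. Out of curiosity, I went to look up what it is. There I found a passage that is not novel at all, but that puts things very clearly.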
Data Scientists often spend most of their time either cleaning data or building features. While we cannot change the first thing, the second can be automated. TSFRESH frees your time spent on building features by extracting them automatically. Hence, you have more time to study the newest deep learning paper, read hacker news or build better models.
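That is the passage. Really, whether or not someone's writing is good, if it got published there must be something strong about it. If you have an eye for spotting those bright spots, you will keep learning interesting things. This paper, written in 2018 and still at zero citations in 2020, has exactly such bright spots.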
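To picture what "extracting features automatically" means, here is a minimal sketch in plain Python. It does not use the tsfresh API. The helper name `extract_basic_features`, the toy `signals`, and the handful of features are my own assumptions for illustration; tsfresh itself computes a far larger set.

```python
import statistics

def extract_basic_features(series):
    """Summarize one time series as a few numbers (hypothetical helper, not tsfresh)."""
    # Differences between consecutive points, used for a simple "activity" feature.
    diffs = [b - a for a, b in zip(series, series[1:])]
    return {
        "mean": statistics.fmean(series),
        "std": statistics.pstdev(series),
        "min": min(series),
        "max": max(series),
        "abs_sum_of_changes": sum(abs(d) for d in diffs),
    }

# Toy input: one list of values per series id (made-up data).
signals = {
    "sensor_a": [1.0, 2.0, 3.0, 4.0],
    "sensor_b": [5.0, 5.0, 5.0, 5.0],
}

# One row of features per series, ready to feed into a classifier.
feature_table = {name: extract_basic_features(values) for name, values in signals.items()}
```

Each series collapses into one fixed-length row, which is what lets an ordinary classifier work on time-series data.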
Imbalanced data vs. balanced data. SMOTE: an oversampling technique for handling imbalanced datasets: the minority class is oversampled by creating more samples using interpolation between neighbors.
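The interpolation step can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the reference SMOTE implementation. The function name `smote_sketch`, the parameter `k`, and the toy points are assumptions of mine. It expects at least two minority points.

```python
import math
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # The k nearest minority neighbors of the base point (excluding itself).
        others = minority[:i] + minority[i + 1:]
        neighbors = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbor = rng.choice(neighbors)
        # Random position on the line segment between base and neighbor.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic

# Made-up minority-class samples with two features each.
minority_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_points = smote_sketch(minority_points, n_new=6)
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies.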
LSTM (long short-term memory): time-series data classification (multivariate classification)
CNN (convolutional neural network): video classification
Statistical approach to time-series classification: tackle the imbalanced-data / data-amount problem with the SMOTE approach; classifiers: SVM (Support Vector Machine) and Naive Bayes (NB), which copes with the high dimensionality of the data (train Naive Bayes with different uniform priors, say 0.33, 0.34).
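As a sketch of what "Naive Bayes with near-uniform priors" could look like, here is a small Gaussian Naive Bayes in plain Python. I am assuming a three-class problem (so the priors are 0.33, 0.33, 0.34) and Gaussian features. The note does not say which variant was used, and the function names and toy data are made up for illustration.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Estimate the per-class mean and variance of every feature."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    stats = {}
    for label, rows in by_class.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # A small constant keeps each variance strictly positive.
        variances = [
            sum((v - m) ** 2 for v in c) / len(c) + 1e-9
            for c, m in zip(cols, means)
        ]
        stats[label] = (means, variances)
    return stats

def predict(stats, priors, row):
    """Return the class with the highest log prior plus log likelihood."""
    def log_posterior(label):
        means, variances = stats[label]
        log_lik = sum(
            -0.5 * math.log(2 * math.pi * var) - (x - m) ** 2 / (2 * var)
            for x, m, var in zip(row, means, variances)
        )
        return math.log(priors[label]) + log_lik
    return max(stats, key=log_posterior)

# Made-up data: three well-separated classes, two features each.
X = [[1.0, 1.1], [0.9, 1.0], [1.1, 0.9],
     [5.0, 5.2], [5.1, 4.9], [4.9, 5.0],
     [9.0, 9.1], [9.2, 8.9], [8.9, 9.0]]
y = ["a"] * 3 + ["b"] * 3 + ["c"] * 3

# Near-uniform priors supplied by hand instead of estimated from class counts.
priors = {"a": 0.33, "b": 0.33, "c": 0.34}

stats = fit_gaussian_nb(X, y)
predictions = [predict(stats, priors, row) for row in [[1.0, 1.0], [5.0, 5.0], [9.0, 9.0]]]
```

Passing the priors in by hand, rather than estimating them from class frequencies, keeps the classifier from simply favoring whichever class has the most samples.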
In the end, compare the imbalanced data with the balanced data and compare the results, using accuracy to compare different results under the same sampling (say, all on balanced datasets).
EOIs: events of interest
PCA (Principal Component Analysis) allows the researcher to decorrelate the data by removing redundant features (can compare results using different numbers of principal components).
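To see how PCA folds redundant features together, here is a minimal sketch that finds only the first principal component by power iteration, in plain Python. The function names and the toy data are assumptions for illustration. A real analysis would more likely use a library routine and keep several components. The sketch assumes the data is not constant.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (feature means, unit direction of largest variance) via power iteration."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalize to unit length.
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return mean, v

def project(rows, mean, direction):
    """Reduce each row to its single coordinate along the given direction."""
    return [
        sum((r[j] - mean[j]) * direction[j] for j in range(len(mean)))
        for r in rows
    ]

# Made-up data: the second feature is roughly twice the first, so it is redundant.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
mean, pc1 = first_principal_component(data)
scores = project(data, mean, pc1)
```

Here the two correlated columns are replaced by one score per sample, and the direction `pc1` points roughly along the line where the second feature is twice the first.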