Structure of Sentences: Parse trees
Shallow parsing identifies phrasal units; the task of identifying the relationships between them is called parsing.
Parse trees indicate how the different grammatical units in a sentence are related hierarchically (also referred to as a constituency parse; chart-based).
Dependency parsing: represents the sentence as a directed graph (graph-based).
Backpropagation Through Time
Long Short-Term Memory
Delete information from the context that is no longer needed: forget gate f
$$ f_t = \sigma (U_f h_{t-1} + W_f X_t) $$
$$ k_t = c_{t-1} \odot f_t $$
Compute the actual information we need to extract from the previous hidden state and
More about Graph Neural Network
Algebraic representation of graphs
1. Adjacency matrix: $$ A_{i j}= \begin{cases} 1 & \text { if }\lbrace v_{i}, v_{j}\rbrace \in E \text { and } i \neq j \cr 0 & \text { otherwise } \end{cases} $$
2. Degree matrix: D is a diagonal matrix, where $$ D_{ii} = d(v_i) $$
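The two definitions above can be sketched in code. This is a minimal illustration, assuming an undirected simple graph given as an edge list with 0-based vertex indices; the names `Matrix`, `adjacency_matrix`, and `degree_matrix` are hypothetical and not from the original post.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>;
using Edge = std::pair<std::size_t, std::size_t>;

// Adjacency matrix of an undirected simple graph with n vertices.
// A[i][j] = 1 if {v_i, v_j} is an edge and i != j, otherwise 0.
Matrix adjacency_matrix(std::size_t n, const std::vector<Edge>& edges) {
    Matrix A(n, std::vector<int>(n, 0));
    for (const Edge& e : edges) {
        const std::size_t i = e.first;
        const std::size_t j = e.second;
        assert(i < n && j < n);   // assumption: vertex ids are 0..n-1
        if (i == j) continue;     // self-loops are excluded by the i != j condition
        A[i][j] = 1;
        A[j][i] = 1;              // undirected: the pair is unordered
    }
    return A;
}

// Degree matrix: diagonal, with D[i][i] = d(v_i), the row sum of A.
Matrix degree_matrix(const Matrix& A) {
    const std::size_t n = A.size();
    Matrix D(n, std::vector<int>(n, 0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            D[i][i] += A[i][j];
        }
    }
    return D;
}
```

For a path graph on three vertices with edges {v_0, v_1} and {v_1, v_2}, the middle vertex gets degree 2 and the two end vertices get degree 1.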
Introduction of Graph Neural Networks
Data
Euclidean structured data (image, video, voice …): it is easy to find adjacent neighbors and easy to define distance.
Non-Euclidean data (graph, manifold): it is hard to define adjacent neighbors, or the number of adjacent nodes varies, which makes it hard to define distance, convolution …
Embed (project) non-Euclidean data into Euclidean space using geometric deep learning.
A CUDA/C++ starter cheatsheet
Hardware and software
Thread, block, and grid are logical concepts that make programming easier. In hardware, each GPU is made of many streaming multiprocessors (SMs), each of which runs many threads.
Concepts
kernel: the code (function) that runs on the GPU
One kernel has only one grid; a grid has blocks; block
Learn C++11 thread library. Code snippets from Concurrent Programming with C++11
Process vs. Threads
Usage Summary
A short summary of the thread library in the STL
thread and async

    /* thread */
    std::thread t1(factorial, 6);  // create a new thread
    std::this_thread::sleep_for(std::chrono::milliseconds(3));
    std::chrono::steady_clock::time_point tp = std::chrono::steady_clock::now() + std::chrono::microseconds(4);
    std::this_thread::sleep_until(tp);

    /* async() */
    std::future<int> fu = std::async(factorial, 6);  // may run in a new thread (default launch policy)

mutex

    /* Mutex */
    std::mutex mu;
    std::lock_guard<std::mutex> locker(mu);     // RAII lock, released at scope exit
    std::unique_lock<std::mutex> ulocker(mu);   // movable lock that can be unlocked and relocked
    ulocker.
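The thread, async, and mutex calls summarized above can be combined into one small runnable sketch. The body of `factorial` and the helper `run_thread_and_async` are assumptions made for illustration; the excerpt only shows `factorial` being called.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical worker: n! computed iteratively (small n only, to avoid overflow).
int factorial(int n) {
    assert(n >= 0 && n <= 12);
    int result = 1;
    for (int i = 2; i <= n; ++i) result *= i;
    return result;
}

// Computes factorial(n) once on an explicit std::thread and once through
// std::async, then returns the sum of both results.
int run_thread_and_async(int n) {
    int from_thread = 0;
    std::mutex mu;

    std::thread t1([&] {
        const int value = factorial(n);
        std::lock_guard<std::mutex> locker(mu);  // released when the lambda returns
        from_thread = value;
    });

    // std::launch::async requests a separate thread; the default policy may defer.
    std::future<int> fu = std::async(std::launch::async, factorial, n);

    t1.join();                        // wait before reading from_thread
    const int from_async = fu.get();  // blocks until the task has finished
    return from_thread + from_async;
}
```

Calling `run_thread_and_async(6)` computes 6! twice, once per mechanism. On some toolchains the program has to be built with `-pthread`.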
One-way ANOVA from scratch
Calculate the Sum of Squares Total (SST), where k is the number of groups and n_j is the number of observations in group j: $$ SS_{total} = \sum_{j=1}^k \sum_{i=1}^{n_j} (X_{ij} - \bar{X})^2 $$
Calculate the Sum of Squares Within Groups (SSW): $$ SS_{within} = \sum_{j=1}^k \sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^2 $$
Calculate the Sum of Squares Between Groups (SSB): $$ SS_{between} = \sum_{j=1}^k n_j ( \bar{X}_j - \bar{X}) ^2 $$
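The three sums of squares can be computed in a single pass over the groups. The sketch below is one possible from-scratch version; `AnovaResult` and `one_way_anova` are hypothetical names, and the F statistic (SSB / (k - 1) divided by SSW / (N - k), with N the total number of observations) is the standard follow-up step rather than something shown in the excerpt above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct AnovaResult {
    double ss_total;     // SST
    double ss_within;    // SSW
    double ss_between;   // SSB
    double f_statistic;  // (SSB / (k - 1)) / (SSW / (N - k))
};

// groups[j] holds the observations X_ij of group j.
AnovaResult one_way_anova(const std::vector<std::vector<double>>& groups) {
    const std::size_t k = groups.size();
    std::size_t n_total = 0;
    double grand_sum = 0.0;
    for (const auto& g : groups) {
        assert(!g.empty());
        n_total += g.size();
        for (double x : g) grand_sum += x;
    }
    assert(k >= 2 && n_total > k);  // needed for the degrees of freedom
    const double grand_mean = grand_sum / static_cast<double>(n_total);

    AnovaResult r{0.0, 0.0, 0.0, 0.0};
    for (const auto& g : groups) {
        double group_sum = 0.0;
        for (double x : g) group_sum += x;
        const double n_j = static_cast<double>(g.size());
        const double group_mean = group_sum / n_j;
        for (double x : g) {
            r.ss_total += (x - grand_mean) * (x - grand_mean);
            r.ss_within += (x - group_mean) * (x - group_mean);
        }
        r.ss_between += n_j * (group_mean - grand_mean) * (group_mean - grand_mean);
    }
    const double df_between = static_cast<double>(k - 1);
    const double df_within = static_cast<double>(n_total - k);
    r.f_statistic = (r.ss_between / df_between) / (r.ss_within / df_within);
    return r;
}
```

A useful sanity check on any input is that SST equals SSW + SSB up to rounding error.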
Censoring Censoring
Survival without Censoring
Survival with Censoring
Kaplan-Meier Curve
More individuals in each group and better separation between the groups give a better (smaller) p-value.
Takes censoring into account
Estimates the probability of “survival” on a given day
Conditional probability of surviving on a given day: $$ \frac {N_{ \text{“alive” day before}} - N_{ \text{“dying” next day}}} {N_{ \text{“alive” day before}}} $$
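As a rough sketch of how the daily conditional probability turns into a survival curve: the Kaplan-Meier estimate for a day is the running product of the conditional probabilities up to that day. The `DayCounts` table layout and the function name `kaplan_meier` are assumptions for illustration. Censored subjects are handled by the caller, who removes them from `at_risk` on later days without counting them as deaths.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One row of a hypothetical daily table.
struct DayCounts {
    int at_risk;  // subjects "alive" and still observed just before this day
    int deaths;   // subjects "dying" on this day
};

// Returns the estimated survival probability after each day:
// S(day) = product over days so far of (at_risk - deaths) / at_risk.
std::vector<double> kaplan_meier(const std::vector<DayCounts>& days) {
    std::vector<double> survival;
    survival.reserve(days.size());
    double s = 1.0;
    for (const DayCounts& d : days) {
        assert(d.at_risk > 0 && d.deaths >= 0 && d.deaths <= d.at_risk);
        s *= static_cast<double>(d.at_risk - d.deaths) / static_cast<double>(d.at_risk);
        survival.push_back(s);
    }
    return survival;
}
```

For example, with 10 subjects at risk and 1 death on day one, then 8 at risk (one subject censored) and 2 deaths on day two, the estimates are 0.9 and 0.9 × 0.75 = 0.675.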
NLP Basics for the newbies like me
Language model
Models that assign probabilities to sequences of words are called language models.
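To make "assigning a probability to a sequence of words" concrete, here is a minimal sketch of a count-based bigram model using maximum-likelihood estimates and no smoothing. The class `BigramModel` and the start token `<s>` are illustrative assumptions, not part of the original notes.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

class BigramModel {
public:
    // Count every (previous word, word) pair in one tokenized sentence.
    void train(const std::vector<std::string>& sentence) {
        std::string prev = "<s>";  // assumed sentence-start token
        for (const std::string& w : sentence) {
            ++bigram_[std::make_pair(prev, w)];
            ++context_[prev];
            prev = w;
        }
    }

    // P(w_1..w_n) = product of count(prev, w) / count(prev).
    // Unseen bigrams give probability 0 because there is no smoothing.
    double probability(const std::vector<std::string>& sentence) const {
        double p = 1.0;
        std::string prev = "<s>";
        for (const std::string& w : sentence) {
            auto ctx = context_.find(prev);
            auto big = bigram_.find(std::make_pair(prev, w));
            if (ctx == context_.end() || big == bigram_.end()) return 0.0;
            assert(big->second <= ctx->second);  // a bigram cannot outnumber its context
            p *= static_cast<double>(big->second) / static_cast<double>(ctx->second);
            prev = w;
        }
        return p;
    }

private:
    std::map<std::pair<std::string, std::string>, int> bigram_;
    std::map<std::string, int> context_;
};
```

After training on "the cat sat" and "the dog sat", the sentence "the cat sat" gets probability 1 × 0.5 × 1 = 0.5, while a word order never seen in training gets 0.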
Count-based Representation
1. One-hot representation
2. BoW: Bag of words
BoW describes the occurrence of words within a document, including:
A vocabulary of known words
A measure of the presence of known words, e.
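The two ingredients of BoW listed above can be sketched as follows. This is a minimal version that assumes whitespace tokenization and uses raw counts as the measure of presence (one possible choice); `Vocabulary`, `build_vocabulary`, and `bag_of_words` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Vocabulary of known words: each word mapped to a column index.
using Vocabulary = std::map<std::string, std::size_t>;

Vocabulary build_vocabulary(const std::vector<std::string>& documents) {
    Vocabulary vocab;
    for (const std::string& doc : documents) {
        std::istringstream words(doc);
        std::string w;
        while (words >> w) vocab.emplace(w, 0);  // duplicates are ignored
    }
    std::size_t index = 0;
    for (auto& entry : vocab) entry.second = index++;  // indices in alphabetical order
    return vocab;
}

// Count how often each known word occurs in doc; unknown words are skipped.
std::vector<int> bag_of_words(const std::string& doc, const Vocabulary& vocab) {
    std::vector<int> counts(vocab.size(), 0);
    std::istringstream words(doc);
    std::string w;
    while (words >> w) {
        auto it = vocab.find(w);
        if (it == vocab.end()) continue;
        assert(it->second < counts.size());
        ++counts[it->second];
    }
    return counts;
}
```

With the documents "the cat sat" and "the dog sat on the mat", the vocabulary is cat, dog, mat, on, sat, the, and the second document maps to the count vector 0, 1, 1, 1, 1, 2.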
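Statistical principles and calculation of sample size, effect size, significance level, and statistical power
Effect size is usually measured in three ways: standardized mean difference (standa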