Graph Convolutional Network and Graph Attention
Why a deep graph encoder? Limitations of shallow encoders (e.g. node2vec):

- $O(|V|)$ parameters are needed: no parameters are shared between nodes, and every node has its own unique embedding.
- Inherently “transductive”: they cannot generate embeddings for nodes that were not seen during training.
- They do not incorporate node features, although many graphs have features that we can and should leverage.

A Graph Convolutional Network can produce embeddings for unseen nodes!
Problem: given a network with labels on some nodes, how do we assign labels to all the other nodes in the network?

The classification label of an object $O$ in the network may depend on:

- Features of $O$
- Labels of the objects in $O$'s neighborhood
- Features of the objects in $O$'s neighborhood

Collective classification models:

- Relational classifiers
- Iterative classification
- Loopy belief propagation

Intuition: simultaneous classification of interlinked nodes using correlations.
Node embeddings are learned in the same way as word2vec (skip-gram model).
However, graphs can be (un)directed, (un)weighted, and (a)cyclic, and are in general much more complex than the structure of a sequence.
So how do we generate a “corpus” from a graph?
Random walk on the graph: given a graph and a starting point, we select one of its neighbors at random and move to it; then we select a neighbor of this new point at random and move to it, and so on.
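The random-walk procedure above can be sketched in a few lines of Python. This is a minimal illustration, assuming an unweighted graph stored as an adjacency-list dictionary and uniform neighbor selection; the names `random_walk`, `build_corpus`, and the toy graph `adj` are hypothetical and not taken from node2vec or any library.

```python
import random


def random_walk(adj, start, length, rng):
    """Uniform random walk of at most `length` nodes, starting at `start`."""
    walk = [start]
    while len(walk) < length:
        neighbors = adj[walk[-1]]
        if not neighbors:  # dead end: stop the walk early
            break
        walk.append(rng.choice(neighbors))
    return walk


def build_corpus(adj, walks_per_node, length, seed=0):
    """Treat every walk as one 'sentence' whose 'words' are node ids."""
    rng = random.Random(seed)
    corpus = []
    for _ in range(walks_per_node):
        for node in adj:
            corpus.append(random_walk(adj, node, length, rng))
    return corpus


# Toy undirected graph (assumed for illustration only).
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
corpus = build_corpus(adj, walks_per_node=2, length=5)
```

Each walk in `corpus` can then be fed to a skip-gram trainer in place of a sentence.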
Word2Vec
Continuous Bag of Words Model (CBOW)
Training uses an N-gram language model: for each target word, select the $m$ (window size) words before and after it.
Model
One-hot encoding gives $2m$ context vectors: $$X = (x^{c-m}, \cdots, x^{c-1}, x^{c+1}, \cdots, x^{c+m})$$
Embedding matrix $\mathcal{V} \in \mathbb{R}^{n \times |V|}$,
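To make the context window concrete, here is a small sketch that builds the one-hot context vectors for a centre word at position $c$ with window size $m$. The helper names (`one_hot`, `cbow_context`), the toy sentence, and the alphabetical vocabulary indexing are assumptions made for illustration.

```python
def one_hot(index, size):
    """Vector of zeros with a single 1 at `index`."""
    vec = [0] * size
    vec[index] = 1
    return vec


def cbow_context(tokens, c, m, vocab):
    """One-hot vectors for the words within m positions of the centre word at c."""
    size = len(vocab)
    positions = [i for i in range(c - m, c + m + 1)
                 if i != c and 0 <= i < len(tokens)]
    return [one_hot(vocab[tokens[i]], size) for i in positions]


tokens = "the quick brown fox jumps".split()
# Assumed vocabulary: words indexed in alphabetical order.
vocab = {w: i for i, w in enumerate(sorted(set(tokens)))}

# Centre word "brown" (c=2) with window m=2 gives 2m = 4 context vectors.
X = cbow_context(tokens, c=2, m=2, vocab=vocab)
```

Near the start or end of a sentence the window is clipped, so fewer than $2m$ vectors are returned.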
A probabilistic graphical model (PGM) is a framework for describing learning tasks; it reduces a learning task to computing
Configure IGV on the server.
I have to share interactive results with my colleague, but I don’t want to install the UCSC Genome Browser locally. A lightweight alternative is what I need.
1. Installation

Install nodejs. If you have conda, just run:

    conda install -c conda-forge nodejs

Build igv-webapp:

    git clone https://github.
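Canonical correlation analysis (CCA) is a commonly used dimensionality reduction algorithm that can also be used to compute correlations between multiple linear spaces, for example multimodal data from the same object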
A biologist like me may never have had any training in numerical computing. I didn’t even know what a complex number really means. Here are some useful basics to keep in mind.
Complex number

A complex number $a + bi$ lives in the 2D complex plane, which includes:

- real axis: $a$
- imaginary axis: $i$, orthogonal to the real axis ($i \rightarrow 90^\circ \text{ rotation}$)

2 ways of representation:

- $z = a + bi$
- $z = r \cos(\phi) + r \sin(\phi) i = r e^{i \phi}$

3 facts about multiplication:

- $z \cdot 1 = z$
- $z \cdot i = \operatorname{Rot90}(z)$
- e.
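The two representations and the first two multiplication facts are easy to check with Python's built-in complex type. This is a small sketch; the value $z = 3 + 4i$ is an arbitrary example chosen here.

```python
import cmath

z = complex(3, 4)                    # z = a + bi with a = 3, b = 4

# Polar form: r is the modulus, phi the angle from the real axis.
r, phi = cmath.polar(z)
z_trig = cmath.rect(r, phi)          # r*cos(phi) + r*sin(phi)*i
z_exp = r * cmath.exp(1j * phi)      # r * e^{i*phi}

# Multiplying by 1 leaves z unchanged.
same = z * 1

# Multiplying by i rotates z by 90 degrees counter-clockwise.
rotated = z * 1j                     # (-4+3j)
angle_change = cmath.phase(rotated) - cmath.phase(z)  # approximately pi/2
```

The modulus is unchanged by the rotation: `abs(rotated)` equals `abs(z)`.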
The best part of snakemake is that it lets you run your pipeline on an HPC cluster automatically, which saves you a lot of time.
How to run snakemake on HPC: there are two ways to configure it.
Use --cluster: works on different HPC systems, e.g. Slurm, SGE. Assign resources explicitly in the params directive.
A biologist’s way to learn the Fourier transform

Visual intuition in 3D: this is an awesome introduction.

Fourier Series
Discrete Fourier transform (DFT)

A Fourier series is a periodic function composed of harmonically related sinusoids, combined by a weighted summation. A periodic function can be transformed into
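As a concrete illustration of a weighted sum of harmonically related sinusoids, the sketch below builds a signal from two sinusoids and recovers their weights with a naive DFT. It assumes the common convention $X_k = \sum_{n=0}^{N-1} x_n e^{-2\pi i k n / N}$; the function name `dft` and the test signal are chosen here for illustration, and a real analysis would normally use an FFT library instead of this $O(N^2)$ loop.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT: X[k] = sum_n x[n] * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
        for k in range(N)
    ]


N = 8
# Weighted sum of two harmonics: frequency 1 (weight 1.0) and frequency 2 (weight 0.5).
signal = [
    math.cos(2 * math.pi * 1 * n / N) + 0.5 * math.sin(2 * math.pi * 2 * n / N)
    for n in range(N)
]

spectrum = dft(signal)
# For a real signal each harmonic's weight is split between bins k and N-k,
# so a sinusoid of amplitude A shows up as A/2 in each of the two bins.
magnitudes = [abs(X) / N for X in spectrum]
```

Here `magnitudes[1]` and `magnitudes[7]` come out close to 0.5, `magnitudes[2]` and `magnitudes[6]` close to 0.25, and the remaining bins are close to zero.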