不学无术

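Natural Language | Machine Learning :: Word Vectors

Understanding how word vectors work http://blog.csdn.net/itplus/article/details/12782781
"Machine Learning in Action", Training the algorithm: computing probabilities from word vectors http://www.ituring.com.cn/article/32340

Tags NLP, Machine Learning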

不学无术

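Machine Learning Resources

This post compiles frameworks, libraries, and software for machine learning, grouped by programming language.
C++
Computer Vision

CCV: C-based/Cached/Core Computer Vision Library, a modern computer vision library
OpenCV: provides C++, C, Python, Java, and MATLAB interfaces, and supports Windows, Linux, Android, and Mac OS.

General-Purpose Machine Learning

MLPack
DLib
encog
shark

Clojure
General-Purpose Machine Learning

Clojure Toolbox: a categorized directory of Clojure libraries and tools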

Go
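Natural Language Processing

go-porterstemmer: a native Go clean-room implementation of the Porter stemming algorithm
paicehusk: a Go implementation of the Paice/Husk stemming algorithm
snowball: a Snowball stemmer for Go

General-Purpose Machine Learning

Go Learn: a machine learning library for Go
go-pr: a machine learning package for Go
bayesian: a naive Bayes classification library for Go
go-galib: a genetic algorithm library for Go

Data Analysis / Data Visualization

go-graph: a graph library for Go
SVGo: an SVG generation library for Go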

Java
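Natural Language Processing

CoreNLP: Stanford's CoreNLP provides a set of natural language processing tools that take raw English text as input and return, among other things, the base forms of words (the tools below whose names start with "Stanford" are all part of it).
Stanford Parser: a natural language parser
Stanford POS Tagger: a part-of-speech tagger
Stanford Named Entity Recognizer: a named entity recognizer implemented in Java
Stanford Word Segmenter: a word segmenter, a standard preprocessing step in many NLP tasks
Tregex, Tsurgeon and Semgrex: pattern matching on tree data structures, using regular expressions over tree relationships and node matches (the name is short for "tree regular expressions")
Stanford Phrasal: a state-of-the-art statistical phrase-based machine translation system, written in Java
Stanford Tokens Regex: a framework for defining patterns over text
Stanford Temporal Tagger: SUTime is a library for recognizing and normalizing time expressions
Stanford SPIED: uses patterns on a seed set to iteratively learn character entities from unlabeled text
Stanford Topic Modeling Toolbox: topic modeling tools for social scientists and others who want to analyze datasets
Twitter Text Java: a Java implementation of Twitter's text processing library
MALLET: a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text
OpenNLP: a machine learning toolkit for processing natural language text
LingPipe: a toolkit for processing text using computational linguistics

General-Purpose Machine Learning

MLlib in Apache Spark: the distributed machine learning library in Spark
Mahout: a distributed machine learning library
Stanford Classifier: a classifier from Stanford
Weka: a collection of machine learning algorithms for data mining
ORYX: a simple infrastructure for large-scale real-time machine learning and predictive analytics

Data Analysis / Data Visualization

Hadoop: a big data analysis platform
Spark: a fast, general engine for large-scale data processing
Impala: real-time queries for Hadoop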

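JavaScript
Natural Language Processing

Twitter-text-js: a JavaScript implementation of Twitter's text processing library
NLP.js: NLP tools written in JavaScript and CoffeeScript
natural: general NLP tools for Node
Knwl.js: a natural language processor written in JS

Data Analysis / Data Visualization

D3.js
High Charts
NVD3.js
dc.js
chartjs
dimple
amCharts

General-Purpose Machine Learning

Convnet.js: a JavaScript library for training deep learning models
Clustering.js: clustering algorithms implemented in JavaScript, for Node.js and the browser
Decision Trees: a Node.js decision tree implementation using the ID3 algorithm
Node-fann: a Fast Artificial Neural Network library for Node.js
Kmeans.js: a simple JavaScript implementation of k-means, for Node.js and the browser
LDA.js: LDA topic modeling for Node.js
Learning.js: JavaScript implementations of logistic regression and C4.5 decision trees
Machine Learning: a machine learning library for Node.js
Node-SVM: support vector machines for Node.js
Brain: neural networks in JavaScript
Bayesian-Bandit: an implementation of the Bayesian bandit algorithm, for Node.js and the browser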

Julia
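General-Purpose Machine Learning

PGM: a probabilistic graphical model framework in Julia
DA: a Julia package for regularized discriminant analysis
Regression: regression analysis algorithms (such as linear and logistic regression)
Local Regression: local regression, so smooth!
Naive Bayes: a simple Julia implementation of naive Bayes
Mixed Models: a Julia package for (statistical) mixed-effects models
Simple MCMC: a basic MCMC sampler in Julia
Distance: a Julia module for distance evaluation
Decision Tree: a decision tree classifier and regressor
Neural: neural networks in Julia
MCMC: MCMC tools for Julia
GLM: generalized linear models in Julia
Online Learning
GLMNet: a Julia wrapper for GLMNet, for fitting lasso/elastic net models
Clustering: basic functions for data clustering: k-means, dp-means, etc.
SVM: support vector machines for Julia
Kernel Density: kernel density estimators for Julia
Dimensionality Reduction: dimensionality reduction algorithms
NMF: a Julia package for non-negative matrix factorization
ANN: neural networks in Julia

Natural Language Processing

Topic Models: topic modeling in Julia
Text Analysis: a Julia package for text analysis

Data Analysis / Data Visualization

Graph Layout: graph layout algorithms in pure Julia
Data Frames Meta: metaprogramming tools for DataFrames
Julia Data: a Julia library for working with tabular data
Data Read: reads files from Stata, SAS, and SPSS
Hypothesis Tests: hypothesis tests for Julia
Gadfly: a crafty statistical plotting system written in Julia
Stats: statistical tests for Julia
RDataSets: a Julia package for loading many of the datasets available in R
DataFrames: a Julia library for working with tabular data
Distributions: a Julia package for probability distributions and associated functions
Data Arrays: data structures whose elements may be missing
Time Series: a time series toolkit for Julia
Sampling: basic sampling algorithms for Julia

Misc / Presentations

DSP: digital signal processing
JuliaCon Presentations: presentations from JuliaCon
SignalProcessing: signal processing tools for Julia
Images: an image library for Julia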

Lua
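General-Purpose Machine Learning

Torch7
cephes: the Cephes mathematical function library, wrapped for Torch. It provides and wraps more than 180 special mathematical functions developed by Stephen L. Moshier; the library sits at the heart of SciPy and is used in many other places.
graph: a graph package for Torch
randomkit: the random number generator package extracted from Numpy, wrapped for Torch
signal: a signal processing toolbox for Torch-7, covering FFT, DCT, Hilbert, cepstrums, and STFT
nn: a neural network package for Torch
nngraph: provides graphical computation for the nn library
nnx: an unstable, experimental package that extends Torch's built-in nn library
optim: an optimization library for Torch, including SGD, Adagrad, conjugate gradient, LBFGS, RProp, and more
unsup: an unsupervised learning package for Torch. It provides modules compatible with nn (LinearPsd, ConvPsd, AutoEncoder, ...) and standalone algorithms (k-means, PCA).
manifold: a package for manipulating manifolds
svm: a support vector machine library for Torch
lbfgs: an FFI wrapper for liblbfgs
vowpalwabbit: an old vowpalwabbit interface to Torch
OpenGM: OpenGM is a C++ library for graphical modeling and inference; this binding lets you describe graphs in Lua in a simple way and then optimize them with OpenGM
sphagetti: a sparse linear module for torch7, by MichaelMathieu
LuaSHKit: a Lua wrapper around the locality-sensitive hashing library SHKit
kernel smoothing: KNN, kernel-weighted average, and local linear regression smoothers
cutorch: the CUDA backend for Torch
cunn: the CUDA neural network implementation for Torch
imgraph: an image/graph library for Torch, with routines to create graphs from images, segment them, build trees from them, and convert them back to images
videograph: a video/graph library for Torch, with routines to create graphs from videos, segment them, build trees from them, and convert them back to videos
saliency: code and tools around integral images, used to find interest points from fast integral histograms
stitch: uses hugin to stitch images and applies the stitching to a video sequence
sfm: a bundle adjustment / structure from motion package
fex: a feature extraction package for Torch, providing SIFT and dSIFT modules
OverFeat: a state-of-the-art generic dense feature extractor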
Numeric Lua
Lunatic Python
SciLua
Lua – Numerical Algorithms
Lunum

Demos and Scripts

Core torch7 demos repository:
linear regression, logistic regression
face detection (training and detection as separate demos)
MST-based segmenter
train-a-digit-classifier
train-autoencoder
optical flow demo
train-on-housenumbers
train-on-cifar
tracking with deep nets
kinect demo
filter visualization
saliency-networks
Training a Convnet for the Galaxy-Zoo Kaggle challenge (CUDA demo)
Music Tagging: music tagging scripts for torch7
torch-datasets: scripts to load several popular datasets, including:
BSR 500
CIFAR-10
COIL
Street View House Numbers
MNIST
NORB
Atari2600: scripts to generate a dataset from static frames in the Arcade Learning Environment emulator

Matlab
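Computer Vision

Contourlets: MATLAB source code that implements the contourlet transform and its utility functions
Shearlets: MATLAB code for the shearlet transform
Curvelets: MATLAB code for the curvelet transform (a higher-dimensional generalization of the wavelet transform, designed to represent images at different scales and angles)
Bandlets: MATLAB code for the bandlet transform

Natural Language Processing

NLP: an NLP library for Matlab

General-Purpose Machine Learning

Training a deep autoencoder or a classifier on MNIST digits: code for training a deep autoencoder or a classifier on the MNIST digits dataset [deep learning]
t-Distributed Stochastic Neighbor Embedding: an award-winning dimensionality reduction technique, particularly well suited to visualizing high-dimensional datasets
Spider: a complete object-oriented environment for machine learning in Matlab
LibSVM: a library for support vector machines
LibLinear: a library for large linear classification
Machine Learning Module: Prof. M. A. Girolami's machine learning course, including PDFs, lecture notes, and code
Caffe: a deep learning framework developed with cleanliness, readability, and speed in mind
Pattern Recognition Toolbox: a fully object-oriented pattern recognition toolkit for Matlab

Data Analysis / Data Visualization

matlab_gbl: a Matlab package for working with graphs
gamic: efficient pure-Matlab implementations of graph algorithms, complementing the mex functions of MatlabBGL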

.NET
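Computer Vision

OpenCVDotNet: a wrapper that lets .NET applications use OpenCV code
Emgu CV: a cross-platform wrapper that can be compiled on Windows, Linux, Mac OS X, iOS, and Android

Natural Language Processing

Stanford.NLP for .NET: a full port of the Stanford NLP packages to .NET, also available precompiled as a NuGet package

General-Purpose Machine Learning

Accord.MachineLearning: support vector machines, decision trees, naive Bayes models, K-means, Gaussian mixture models, and general algorithms for machine learning applications such as RANSAC, cross-validation, and grid search. This package is part of the Accord.NET framework.
Vulpes: a deep belief and deep learning package written in F#; it runs on CUDA GPUs through Alea.cuBase
Encog: an advanced neural network and machine learning framework. It includes classes for creating many kinds of networks, plus support classes for normalizing and processing the data those networks need. It trains with multithreaded resilient propagation and can use a GPU to cut processing time. A graphical interface helps with modeling and training neural networks.
Neural Network Designer: a database management system and neural network designer. The designer is built with WPF and doubles as a UI in which you can design your neural network, query it, and create and configure chat bots that ask questions and learn from your feedback. The bots can even gather information from the web to use in their output or for learning.

Data Analysis / Data Visualization

numl: a machine learning library that aims to simplify standard modeling techniques for prediction and clustering
Math.NET Numerics: the numerical foundation of the Math.NET project, providing methods and algorithms for numerical computation in science, engineering, and everyday use. Supports .Net 4.0, .Net 3.5, and Mono on Windows, Linux, and Mac; Silverlight 5; WindowsPhone/SL 8; WindowsPhone 8.1; Windows 8 with PCL Portable Profiles 47 and 344; and Android/iOS with Xamarin.
Sho: an interactive environment for data analysis and scientific computing that lets you seamlessly connect scripts (in IronPython) with compiled code (.NET) for fast, flexible prototyping. The environment includes powerful and efficient libraries for linear algebra and data visualization that can be used from any .NET language, as well as a feature-rich interactive shell for rapid development.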

Python
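Computer Vision

SimpleCV: an open-source computer vision framework that gives access to high-performance computer vision libraries such as OpenCV. Written in Python; runs on Mac, Windows, and Ubuntu.

Natural Language Processing

NLTK: a leading platform for building Python programs that work with human language data
Pattern: a web mining module for Python, with tools for natural language processing, machine learning, and more
TextBlob: a consistent API for common natural language processing tasks; built on NLTK and Pattern and works well with both
jieba: a Chinese word segmentation tool
SnowNLP: a library for processing Chinese text
loso: another Chinese word segmentation library
genius: a Chinese word segmentation library based on conditional random fields
nut: a natural language understanding toolkit

General-Purpose Machine Learning

Bayesian Methods for Hackers: an e-book on probabilistic programming in Python
MLlib in Apache Spark: the distributed machine learning library in Spark
scikit-learn: a machine learning module built on SciPy
graphlab-create: a library with many machine learning modules (regression, clustering, recommender systems, graph analytics, etc.), built on a disk-backed DataFrame
BigML: a library that connects to external servers
pattern: a web mining module for Python
NuPIC: Numenta's platform for intelligent computing
Pylearn2: a machine learning library based on Theano
hebel: a GPU-accelerated deep learning library written in Python
gensim: topic modeling tools
PyBrain: another machine learning library
Crab: an extensible, fast recommender engine
python-recsys: a recommender system implemented in Python
thinking bayes: a book on Bayesian analysis
Restricted Boltzmann Machines: restricted Boltzmann machines in Python [deep learning]
Bolt: an online learning toolbox
CoverTree: a Python implementation of cover trees, a convenient replacement for scipy.spatial.kdtree
nilearn: a Python machine learning library for neuroimaging
Shogun: a machine learning toolbox
Pyevolve: a genetic algorithm framework
Caffe: a deep learning framework developed with cleanliness, readability, and speed in mind
breze: a Theano-based library for deep and recurrent neural networks

Data Analysis / Data Visualization

SciPy: a Python-based ecosystem of open-source software for mathematics, science, and engineering
NumPy: the fundamental package for scientific computing with Python
Numba: a Python JIT compiler to LLVM, aimed at scientific computing, by the developers of Cython and NumPy
NetworkX: efficient software for complex networks
Pandas: high-performance, easy-to-use data structures and data analysis tools
Open Mining: business intelligence in Python (a web interface for Pandas)
PyMC: an MCMC sampling toolkit
zipline: an algorithmic trading library for Python
PyDy: short for Python Dynamics; supports a dynamics modeling workflow built on NumPy, SciPy, IPython, and matplotlib
SymPy: a Python library for symbolic mathematics
statsmodels: statistical modeling and econometrics in Python
astropy: a community-developed Python library for astronomy
matplotlib: a 2D plotting library for Python
bokeh: an interactive web plotting library for Python
plotly: collaborative web plotting for Python and matplotlib
vincent: translates Python data structures into the Vega visualization grammar
d3py: a Python plotting library based on D3.js
ggplot: the same API as ggplot2 in R
Kartograph.py: a library for rendering good-looking SVG maps in Python
pygal: an SVG chart generator for Python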
pycascading

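Misc Scripts / IPython Notebooks / Codebases

pattern_classification
thinking stats 2
hyperopt
numpic
2012-paper-diginorm
ipython-notebooks
decision-weights
Sarah Palin LDA: topic modeling of the Sarah Palin emails
Diffusion Segmentation: a collection of image segmentation algorithms based on diffusion methods
Scipy Tutorials: SciPy tutorials; outdated, see scipy-lecture-notes instead
Crab: a recommendation engine library for Python
BayesPy: Bayesian inference tools in Python
scikit-learn tutorials: a series of notebooks for learning scikit-learn
sentiment-analyzer: a tweet sentiment analyzer
group-lasso: experiments with the coordinate descent algorithm, applied to the (sparse) group lasso model
mne-python-notebooks: IPython notebooks for EEG/MEG data processing with mne-python
pandas cookbook: recipes for using the Python pandas library
climin: an optimization library for machine learning, with Python implementations of gradient descent, LBFGS, rmsprop, adadelta, and others

Kaggle Competition Source Code

wiki challange: an implementation of Dell Zhang's solution to a Wikipedia prediction challenge on Kaggle
kaggle insults: code submitted to the Kaggle competition "Detecting Insults in Social Commentary"
kaggle_acquire-valued-shoppers-challenge: code for the Kaggle challenge on predicting repeat shoppers
kaggle-cifar: code for the CIFAR-10 competition on Kaggle, using cuda-convnet
kaggle-blackbox: code for the blackbox competition on Kaggle, about deep learning
kaggle-accelerometer: code for the Kaggle competition on identifying users from accelerometer data
kaggle-advertised-salaries: code for the Kaggle competition on predicting salaries from job ads
kaggle amazon: code for the Kaggle competition on predicting an employee's access needs from their role
kaggle-bestbuy_big: code for the Kaggle competition on predicting the clicked item from bestbuy user queries (big-data version)
kaggle-bestbuy_small: code for the Kaggle competition on predicting the clicked item from bestbuy user queries (small-data version)
Kaggle Dogs vs. Cats: code for the Kaggle competition on telling cats from dogs in images
Kaggle Galaxy Challenge: the winning code for the Kaggle competition on classifying the morphology of distant galaxies
Kaggle Gender: a Kaggle competition on telling gender from handwriting
Kaggle Merck: code for the Kaggle competition on predicting the activity of drug molecules (sponsored by Merck)
Kaggle Stackoverflow: code for the Kaggle competition on predicting whether a Stack Overflow question will be closed
wine-quality: predicting red wine quality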

Ruby
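Natural Language Processing

Treat: a text retrieval and annotation toolkit, the most comprehensive toolkit I have seen for Ruby
Ruby Linguistics: a framework for building linguistic utilities for Ruby objects in any language. It includes a generic language-independent front end, a module that maps language codes to language names, and a module with many English-language utilities.
Stemmer: exposes the libstemmer_c interface to Ruby
Ruby Wordnet: a Ruby interface to WordNet
Raspel: Ruby bindings for aspell
UEA Stemmer: a Ruby port of UEALite Stemmer, a conservative stemmer for search and retrieval
Twitter-text-rb: a library that auto-links and extracts usernames, lists, and hashtags in tweets

General-Purpose Machine Learning

Ruby Machine Learning: some machine learning algorithms implemented in Ruby
Machine Learning Ruby
jRuby Mahout: a gem that unleashes the power of Apache Mahout in the JRuby world
CardMagic-Classifier: a general classifier module that supports Bayesian and other kinds of classification
Neural Networks and Deep Learning: sample code for the book "Neural Networks and Deep Learning"

Data Analysis / Data Visualization

rsruby – Ruby – R bridge
data-visualization-ruby: source code and supporting content for the Ruby Manor presentation on data visualization
ruby-plot: a gnuplot wrapper for Ruby, especially suited to turning ROC curves into SVG files
plot-rb: a Ruby plotting library based on Vega and D3
scruffy: an excellent graphing toolkit for Ruby
SciRuby
Glean: a data management tool
Bioruby
Arel

Misc

Big Data For Chimps: a serious yet fun guide to big data processing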

R
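General-Purpose Machine Learning

Clever Algorithms For Machine Learning
Machine Learning For Hackers
Machine Learning Task View on CRAN: a list of machine learning packages in R, grouped by algorithm type
caret: a unified interface to 150 machine learning algorithms in R
SuperLearner and subsemble: packages that bring together multiple machine learning algorithms
Introduction to Statistical Learning

Data Analysis / Data Visualization

Learning Statistics Using R
ggplot2: a data visualization package based on the grammar of graphics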

Scala
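Natural Language Processing

ScalaNLP: a suite of machine learning and numerical computing libraries
Breeze: a numerical processing library for Scala
Chalk: a natural language processing library
FACTORIE: a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It gives users a succinct language for creating relational factor graphs, estimating parameters, and performing inference.

Data Analysis / Data Visualization

MLlib in Apache Spark: the distributed machine learning library in Spark
Scalding: a Scala API for Cascading
Summing Bird: streaming MapReduce with Scalding and Storm
Algebird: abstract algebra tools for Scala
xerial: data management tools for Scala
simmer: reduce your data; a Unix filter for algebraic aggregation
PredictionIO: a machine learning server for software developers and data engineers
BIDMat: a CPU- and GPU-accelerated matrix library that supports large-scale exploratory data analysis

General-Purpose Machine Learning

Conjecture: a scalable machine learning framework for Scalding
brushfire: decision tree tools for Scalding
ganitha: a Scalding-based machine learning library
adam: a genomics processing engine with its own file format, built on Apache Avro, Apache Spark, and Parquet; Apache 2 licensed
bioscala: a bioinformatics library for Scala
BIDMach: a CPU- and GPU-accelerated machine learning library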

http://www.08kan.com/gwk/MzA4NjA4MTkzMw/203089745/1/ca764aacc4c3601a8a608e219929ac1b.html

Tags Machine Learning

木有技术

Shadowsocks Resources (Download, Installation, Updates)

Post author By idailylife
Post date March 11, 2015

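1. What is Shadowsocks
Shadowsocks is a secure SOCKS5 proxy that protects your internet traffic. It supports several encryption methods; aes-256-cfb is recommended. Installing and using it requires both a local client and a server.
Local clients exist for many platforms, including iOS, Android, Windows, Mac, and even routers (based on OpenWRT), so you can pick whichever you need.
The remote server is usually installed on a Linux distribution such as Debian, CentOS, Fedora, Redhat, Ubuntu, or openSUSE.
2. Who wrote Shadowsocks
Originally there was only a Python version, developed and maintained by @clowwindy. As it became better known, versions in other languages appeared; the best known are the libev, go, and nodejs versions. Note that the nodejs version is also by @clowwindy, but he has recently stopped maintaining it. The libev version is maintained by @madeye and is updated regularly.
3. One-click install scripts for Shadowsocks
Although the author's installation guide is already quite complete, many people still struggle to install and use it, so I wrote one-click install scripts for Shadowsocks. They mainly target CentOS (there is also one for Debian) and cover the Python, libev, and nodejs versions. I personally recommend the Python and libev versions.
The script automatically downloads, compiles, and installs the latest Shadowsocks, can completely uninstall it, and generates the configuration file, so it is ready to use as soon as installation finishes.
4. How to upgrade Shadowsocks
Some time after you install Shadowsocks, the author will have released a new version (bug fixes or new features). How do you upgrade to the latest version in one step?
Python version: run pip install -U shadowsocks and, once it succeeds, restart Shadowsocks with service shadowsocks restart
libev version: first uninstall the old version with ./shadowsocks-libev.sh uninstall and then install the new one with ./shadowsocks-libev.sh install
libev version on Debian: upgrade the same way.
nodejs version: also uninstall and reinstall. Because the author no longer updates it, this version is not recommended.
5. What to do if installation fails
The default gcc on CentOS 5.x is too old, so compiling the libev version fails; its default Python is also too old, so the Python version cannot be installed either. Make sure the environment is CentOS 6.x or CentOS 7.x.
For other errors, search Google for keywords from the actual error message.
6. Shadowsocks client programs
Most people use Windows. The most convenient Windows client is shadowsocks-gui: download the latest version and unzip it to use.
7. Browsing through the local Shadowsocks proxy
Start the client on your local machine. Once it connects to the remote server, a SOCKS5 proxy is open locally, on port 1080 by default; if that port is reported as in use, change it to another one. Install a browser extension (SwitchySharp for Chrome, AutoProxy for Firefox), create a new profile, and set SOCKS Host to 127.0.0.1 and Port to 1080 (the default; it only has to match the local port of the Shadowsocks client).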
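The settings mentioned above (aes-256-cfb, a local SOCKS5 listener on 127.0.0.1:1080) usually end up in a JSON configuration file. The following is only a sketch of generating such a file: the key names follow the config.json layout commonly documented for the Python version and should be treated as an assumption to check against the official README, and the server address, server port 8388, timeout, and password are placeholders, not values from this article.

```python
import json


def make_config(server, password, server_port=8388, local_port=1080,
                method="aes-256-cfb"):
    """Build a client-side configuration dictionary.

    Key names are assumed from the commonly documented config.json layout;
    verify them against the README of the version you install.
    """
    return {
        "server": server,              # placeholder: your server's address
        "server_port": server_port,    # assumed port, not from the article
        "local_address": "127.0.0.1",  # SOCKS Host to enter in the browser
        "local_port": local_port,      # must match the browser's Port field
        "password": password,          # placeholder
        "timeout": 300,                # assumed value
        "method": method,              # cipher recommended in section 1
    }


if __name__ == "__main__":
    # 203.0.113.10 is a documentation-only address used as a stand-in.
    print(json.dumps(make_config("203.0.113.10", "change-me"), indent=4))
```

If port 1080 is already taken, passing a different local_port and entering the same number in the browser profile keeps the two sides consistent.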
References:
1、http://shadowsocks.org/en/index.html
2、https://github.com/clowwindy/shadowsocks
3、https://github.com/madeye/shadowsocks-libev

Source: http://teddysun.com/372.html

Tags shadowsocks, VPN

生活琐碎

Worn Out

Work has been exhausting. I really want a week off.

不学无术

1074. Reversing Linked List (25): watch out for the special case in the last test point!

Post author By idailylife
Post date February 26, 2015

Problem statement:

Given a constant K and a singly linked list L, you are supposed to reverse the links of every K elements on L. For example, given L being 1→2→3→4→5→6, if K = 3, then you must output 3→2→1→6→5→4; if K = 4, you must output 4→3→2→1→5→6.
Input Specification:
Each input file contains one test case. For each case, the first line contains the address of the first node, a positive N (<= 10⁵) which is the total number of nodes, and a positive K (<=N) which is the length of the sublist to be reversed. The address of a node is a 5-digit nonnegative integer, and NULL is represented by -1.
Then N lines follow, each describes a node in the format:
Address Data Next
where Address is the position of the node, Data is an integer, and Next is the position of the next node.
Output Specification:
For each case, output the resulting ordered linked list. Each node occupies a line, and is printed in the same format as in the input.
Sample Input:

00100 6 4
00000 4 99999
00100 1 12309
68237 6 -1
33218 3 00000
99999 5 68237
12309 2 33218

Sample Output:

00000 4 33218
33218 3 12309
12309 2 00100
00100 1 99999
99999 5 68237
68237 6 -1

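Analysis:

At first glance this is a simple problem: reverse a linked list in a regular pattern. My approach:
1. Read the input and store the nodes in a map keyed by address (you could also be sneaky and store them in an array, trading space for time);
2. Follow the list from the head pointer. Keep a stack: push each node as it is read until K nodes are collected, then pop and print them. The trick is that every output line can be split into two parts. Apart from the first and last lines, each pair of adjacent lines looks like:
[previous address] [previous data] [current address]
[current address] [current data] [next address]
So once the current node is known, only the trailing "current address" of the previous line and the first two fields of the current line need to be printed; the trailing "next address" of the current line is filled in when the next node is processed.
3. Finally, do not forget the NULL pointer "-1" at the very end.
The algorithm visits each node a constant number of times, so it is O(N) in node visits (O(N log N) overall when the nodes sit in a std::map, O(N) with an array).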
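The steps above can also be sketched in a few lines of Python. This is not the stack-based C++ solution from this post: it is a simplified variant that first collects the reachable chain into a list and then reverses each full block of K by slicing. The function name `reverse_every_k` and the dictionary layout are made up for illustration.

```python
def reverse_every_k(head, k, nodes):
    """Return the output lines for the reversed list.

    nodes maps an address string to (data, next_address); "-1" is NULL.
    """
    # Walk the chain from head, so nodes that are not on the list are ignored.
    chain = []
    addr = head
    while addr != "-1":
        data, nxt = nodes[addr]
        chain.append((addr, data))
        addr = nxt
    # Reverse each complete block of k nodes; a shorter tail keeps its order.
    full = len(chain) - len(chain) % k
    for start in range(0, full, k):
        chain[start:start + k] = chain[start:start + k][::-1]
    # Re-link: a node's next field is the address of the node now after it.
    lines = []
    for i, (addr, data) in enumerate(chain):
        nxt = chain[i + 1][0] if i + 1 < len(chain) else "-1"
        lines.append(f"{addr} {data} {nxt}")
    return lines


if __name__ == "__main__":
    sample = {
        "00000": (4, "99999"), "00100": (1, "12309"), "68237": (6, "-1"),
        "33218": (3, "00000"), "99999": (5, "68237"), "12309": (2, "33218"),
    }
    print("\n".join(reverse_every_k("00100", 4, sample)))
```

Because the chain is rebuilt from the head pointer, the node count used for the blocks is the number of reachable nodes rather than the N given in the input.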

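But be sure to consider the following special cases:
1. K=1, K=N
2. The head pointer given is not the head of the whole chain but some node in the middle of it, so not all N input nodes are on the list. This is what the last test point checks. I tried many things without success until the comments at the end of this article gave me the hint! Also, test point 6 does not cover input in which several nodes have a NULL (-1) next pointer; the blogs that claim so are wrong.
For example: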

00000 6 4
00000 4 99999
00100 1 12309
68237 6 -1
33218 3 00000
99999 5 68237
12309 2 33218

The correct output in this case is:

00000 4 99999
99999 5 68237
68237 6 -1

Code

#include <cstdlib>
#include <cstring>
#include <iostream>
#include <map>
#include <stack>
using namespace std;
struct Node
{
	char fake_addr[6];
	char fake_next[6];
	int data;
};
int main()
{
	char pStart[6];
	int N, K;
	cin >> pStart >> N >> K;
	map<int, Node> mapNodes;
	for (int i = 0; i < N; i++)
	{
		// Read one node and store it under its numeric address
		Node node;
		cin >> node.fake_addr >> node.data >> node.fake_next;
		mapNodes[atoi(node.fake_addr)] = node;
	}
	// Some input nodes may not be reachable from the head pointer,
	// so walk the chain and count the nodes that are really on the list
	int count = 0;
	int nextPtr = atoi(pStart);
	while (nextPtr != -1)
	{
		nextPtr = atoi(mapNodes[nextPtr].fake_next);
		count++;
	}
	N = count;
	stack<Node> workingStack;
	const int LIMIT = N - N % K;
	int currentFakePtr = atoi(pStart);
	bool firstLine = true;
	// The part that divides evenly into blocks of K
	for (int i = 0; i < LIMIT; i++)
	{
		// Push the node; it is printed once its block is complete
		if (currentFakePtr == -1)
			break;
		Node currentNode = mapNodes[currentFakePtr];
		workingStack.push(currentNode);
		currentFakePtr = atoi(currentNode.fake_next);
		if (i % K == K - 1)
		{
			// Pop and print the block (except the last node's next address)
			while (!workingStack.empty())
			{
				Node top = workingStack.top();
				if (!firstLine)
					cout << top.fake_addr << endl; // end of the previous line
				cout << top.fake_addr << " " << top.data << " "; // first two fields of this line
				workingStack.pop();
				firstLine = false;
			}
		}
	}
	// The leftover part (fewer than K nodes) is printed in its original order.
	// This also covers the case where nothing divides evenly (LIMIT = 0).
	for (int i = LIMIT; i < N; i++)
	{
		if (currentFakePtr == -1)
			break;
		Node currentNode = mapNodes[currentFakePtr];
		if (!firstLine)
			cout << currentNode.fake_addr << endl; // end of the previous line
		cout << currentNode.fake_addr << " " << currentNode.data << " "; // first two fields of this line
		currentFakePtr = atoi(currentNode.fake_next);
		firstLine = false;
	}
	cout << "-1";
	return 0;
}

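Test points

(There is still room for optimization, but since it already runs in under 400 ms I will leave it.)

Test point	Result	Time (ms)	Memory (kB)	Score/Max
0	Accepted	1	360	12/12
1	Accepted	1	232	3/3
2	Accepted	1	360	2/2
3	Accepted	1	232	2/2
4	Accepted	1	232	2/2
5	Accepted	326	8424	3/3
6	Accepted	1	360	1/1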

Tags PAT

不学无术

1078. Hashing (25): Quadratic Probing in a Hash Table | Primality Testing

Post author By idailylife
Post date February 25, 2015

Problem statement (http://www.patest.cn/contests/pat-a-practise/1078):
The task of this problem is simple: insert a sequence of distinct positive integers into a hash table, and output the positions of the input numbers. The hash function is defined to be “H(key) = key % TSize” where TSize is the maximum size of the hash table. Quadratic probing (with positive increments only) is used to solve the collisions.
Note that the table size is better to be prime. If the maximum size given by the user is not prime, you must re-define the table size to be the smallest prime number which is larger than the size given by the user.
Input Specification:
Each input file contains one test case. For each case, the first line contains two positive numbers: MSize (<=10⁴) and N (<=MSize) which are the user-defined table size and the number of input numbers, respectively. Then N distinct positive integers are given in the next line. All the numbers in a line are separated by a space.
Output Specification:
For each test case, print the corresponding positions (index starts from 0) of the input numbers in one line. All the numbers in a line are separated by a space, and there must be no extra space at the end of the line. In case it is impossible to insert the number, print “-” instead.
Sample Input:

4 4
10 6 4 15

Sample Output:

0 1 4 -

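This problem comes down to two things:

Finding primes. I use a lazy shortcut: a table of primes for values below 1000, and plain trial division for 1000 and above (try every divisor from 2 to the square root of X);
Resolving hash collisions. The problem explicitly asks for quadratic probing (positive increments only), that is, the variant in which the probe step only increases. I will skip the details; see here, here, and here. After years of neglecting data structures I had to look things up myself, which is a little embarrassing.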
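Both pieces fit in a short Python sketch. With positive increments only, the slots tried for a key are (key + i*i) mod TSize for i = 0, 1, ..., TSize-1, and the key is reported as "-" if all of them are taken. The function names below are illustrative, and the prime search uses plain trial division throughout instead of the lookup table used in this post.

```python
def is_prime(n):
    """Trial division by every i with i * i <= n."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def next_prime(n):
    """Smallest prime that is greater than or equal to n."""
    while not is_prime(n):
        n += 1
    return n


def insert_all(msize, keys):
    """Insert keys with quadratic probing and return the output line."""
    tsize = next_prime(msize)
    table = [None] * tsize
    out = []
    for key in keys:
        for step in range(tsize):
            pos = (key + step * step) % tsize
            if table[pos] is None:
                table[pos] = key
                out.append(str(pos))
                break
        else:
            # Every probe position was occupied: the key cannot be inserted.
            out.append("-")
    return " ".join(out)


if __name__ == "__main__":
    print(insert_all(4, [10, 6, 4, 15]))  # the sample input from the problem
```

For the sample, the table size 4 becomes 5, and key 15 probes slots 0, 1, 4, 4, 1, all occupied, which is why it prints "-".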

My code (C++):

#include <iostream>
#include <cmath>
using namespace std;
int prime_1000[] = { 2, 3, 5, 7, 11, 13, 17, 19, 23, 29,
31, 37, 41, 43, 47, 53, 59, 61, 67, 71,
73, 79, 83, 89, 97, 101, 103, 107, 109, 113,
127, 131, 137, 139, 149, 151, 157, 163, 167, 173,
179, 181, 191, 193, 197, 199, 211, 223, 227, 229,
233, 239, 241, 251, 257, 263, 269, 271, 277, 281,
283, 293, 307, 311, 313, 317, 331, 337, 347, 349,
353, 359, 367, 373, 379, 383, 389, 397, 401, 409,
419, 421, 431, 433, 439, 443, 449, 457, 461, 463,
467, 479, 487, 491, 499, 503, 509, 521, 523, 541,
547, 557, 563, 569, 571, 577, 587, 593, 599, 601,
607, 613, 617, 619, 631, 641, 643, 647, 653, 659,
661, 673, 677, 683, 691, 701, 709, 719, 727, 733,
739, 743, 751, 757, 761, 769, 773, 787, 797, 809,
811, 821, 823, 827, 829, 839, 853, 857, 859, 863,
877, 881, 883, 887, 907, 911, 919, 929, 937, 941,
947, 953, 967, 971, 977, 983, 991, 997, 1009 };
int findSmallestPrime_(int biggerThan)
{
	// Values of 1000 and above are handled here, by trial division
	int currNum = biggerThan;
	while (true)
	{
		bool primeFlag = true;
		for (int n = 2; n <= (int)sqrt(currNum); n++)
		{
			if (currNum%n == 0)
			{
				primeFlag = false;
				break;
			}
		}
		if (primeFlag)
			return currNum;
		currNum++;
	}
}
int findSmallestPrime(int biggerThan)
{
	// Returns the smallest prime >= biggerThan
	if (biggerThan < 1000)
	{
		// Below 1000: look it up in the table
		for (int i = 0; i < 169; i++)
		{
			if (prime_1000[i] >= biggerThan)
				return prime_1000[i];
		}
	}
	// 1000 and above: fall back to trial division
	return findSmallestPrime_(biggerThan);
}
int getHashPos(int* hashTable, int Tsize, int val)
{
	// Returns the slot index, or -1 if the value cannot be inserted.
	// Quadratic probing with positive increments only:
	// try (val + i*i) % Tsize for i = 0 .. Tsize-1
	for (int i = 0; i < Tsize; i++)
	{
		int probIndex = (val + i * i) % Tsize;
		if (hashTable[probIndex] == -1)
		{
			hashTable[probIndex] = val;
			return probIndex;
		}
	}
	return -1;
}
int main()
{
	int M, N;
	cin >> M >> N;
	int Tsize = findSmallestPrime(M);
	int* hashTable = new int[Tsize];
	for (int i = 0; i < Tsize; i++)
		hashTable[i] = -1;
	for (int i = 0; i < N; i++)
	{
		int currVal;
		cin >> currVal;
		int pos = getHashPos(hashTable, Tsize, currVal);
		if (pos == -1)
			cout << "-";
		else
			cout << pos;
		if (i < N - 1)
			cout << " ";
	}
	return 0;
}

Test point	Result	Time (ms)	Memory (kB)	Score/Max
0	Accepted	1	360	12/12
1	Accepted	1	360	3/3
2	Accepted	1	232	5/5
3	Accepted	20	360	5/5

The last test point presumably uses larger numbers.

Tags PAT

不学无术木有技术

1065. A+B and C (64bit) (20)

Post author By idailylife
Post date February 22, 2015

http://www.patest.cn/contests/pat-a-practise/1065

The problem statement:

Given three integers A, B and C in [-2⁶³, 2⁶³], you are supposed to tell whether A+B > C.
Input Specification:
The first line of the input gives the positive number of test cases, T (<=10). Then T test cases follow, each consists of a single line containing three integers A, B and C, separated by single spaces.
Output Specification:
For each test case, output in one line “Case #X: true” if A+B>C, or “Case #X: false” otherwise, where X is the case number (starting from 1).
Sample Input:

3
1 2 3
2 3 4
9223372036854775807 -9223372036854775808 0

Sample Output:

Case #1: false
Case #2: true
Case #3: false

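Analysis: this boils down to doing addition and subtraction with carries by hand, taking care with the signs.
The rules for adding signed numbers can be found on Baidu Wenku, in piles of elementary school textbooks, haha.
Roughly, for two numbers a and b:

Sign of the result: if |a| >= |b| the result takes the sign of a, otherwise the sign of b;
Magnitude: ignoring the signs, take the number with the larger absolute value as operand 1 and the other as operand 2; if a and b have the same sign compute operand 1 + operand 2, otherwise operand 1 - operand 2;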
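The two rules above can be sketched on decimal strings in Python. Python's built-in integers are already arbitrary precision, so this is purely an illustration of the rule rather than something you would need in practice; `add_signed` is a made-up name, and the sketch assumes inputs without leading zeros.

```python
def add_signed(a, b):
    """Add two decimal integer strings using the sign/magnitude rule."""
    neg_a, neg_b = a.startswith("-"), b.startswith("-")
    x, y = a.lstrip("-"), b.lstrip("-")
    # Make x the operand with the larger absolute value; its sign wins.
    if (len(x), x) < (len(y), y):
        x, y = y, x
        neg_a, neg_b = neg_b, neg_a
    sign = -1 if neg_a != neg_b else 1  # subtract when the signs differ
    y = y.zfill(len(x))
    digits, carry = [], 0
    for dx, dy in zip(reversed(x), reversed(y)):
        s = int(dx) + sign * int(dy) + carry
        # carry is +1 on overflow past 9, -1 when a borrow is needed
        carry = -1 if s < 0 else (1 if s > 9 else 0)
        digits.append(str(s - 10 * carry))
    if carry > 0:
        digits.append("1")
    magnitude = "".join(reversed(digits)).lstrip("0") or "0"
    return ("-" if neg_a and magnitude != "0" else "") + magnitude


if __name__ == "__main__":
    print(add_signed("9223372036854775807", "-9223372036854775808"))
```

Two details are easy to miss: leading zeros must be stripped from the result (100 + (-99) yields the digits "001"), and a zero result should not carry a minus sign.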

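My code (I had not written C++ in a long time, so it is rather convoluted; bear with me)
isBiggerAbs compares the absolute values of a and b, isBigger compares the actual values, and swapss swaps two elements.
Also, 0x30 is the ASCII code of the character '0'. The input is read as character strings, converted to digits for the arithmetic, and converted back to characters afterwards so that the result is easy to compare with c.
The long numbers in the sample input are essentially the upper and lower limits. With the sign, an input takes up to 20 characters, so a buffer needs at least 21 bytes including the terminating '\0', and the sum A+B can need one digit more.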

#include <iostream>
#include <string.h>
using namespace std;
void swapss(char*a, char* b)
{
	char t[24]; // room for 20 characters plus the terminating '\0'
	strcpy(t, a);
	strcpy(a, b);
	strcpy(b, t);
}
bool isBiggerAbs(char* a, char* b)
{
	int len_a, len_b;
	len_a = strlen(a);
	len_b = strlen(b);
	if (len_a > len_b)
		return true;
	else if (len_a < len_b)
		return false;
	//只需比较位数一样的
	for (int i = 0; i < len_a; i++)
	{
		if (a[i] > b[i])
			return true;
		else if (a[i] < b[i])
			return false;
	}
	//完全相等
	return false;
}
bool isBigger(char* a, char* b)
{
	//Judge if a > b
	bool neg_a = (a[0] == '-');
	bool neg_b = (b[0] == '-');
	if (neg_a && !neg_b)
		return false;
	else if (!neg_a && neg_b)
		return true;
	if (!neg_a)
		return isBiggerAbs(a, b);
	else
	{
		a = strtok(a, "-");
		b = strtok(b, "-");
		return !isBiggerAbs(a, b);
	}
}
void bigPlus(char* a, char* b, char* r)
{
	// r = a + b (the contents of a and b are modified in place)
	int len_a, len_b;
	bool isNeg_a = false;
	bool isNeg_b = false;
	bool isNeg_r = false;
	if (a[0] == '-')
	{
		char* pch = strtok(a, "-");
		a = pch;
		isNeg_a = true;
	}
	if (b[0] == '-')
	{
		char* pch = strtok(b, "-");
		b = pch;
		isNeg_b = true;
	}
	if (!isBiggerAbs(a, b))
	{
		//Swap a and b
		swapss(a, b);
		isNeg_r = isNeg_b;
	}
	else
		isNeg_r = isNeg_a;
	// a now holds the larger magnitude. Add it as positive, and subtract
	// the digits of b only when the original signs differ.
	isNeg_b = (isNeg_a != isNeg_b);
	isNeg_a = false;
	len_a = strlen(a);
	len_b = strlen(b);
	int index_a = len_a - 1;
	int index_b = len_b - 1;
	int remainder = 0; // carry (1) or borrow (-1) into the next digit
	int count = 0;
	while (index_a >= 0 || index_b >= 0)
	{
		int op0 = 0;
		if (index_a >= 0)
			op0 = (int)a[index_a] - 0x30;
		int op1 = 0;
		if (index_b >= 0)
			op1 = (int)b[index_b] - 0x30;
		if (isNeg_b)
			op1 = -op1;
		int result = op0 + op1 + remainder;
		if (result < 0)
		{
			remainder = -1; // borrow from the next digit
			result += 10;
		}
		else if (result > 9)
		{
			remainder = 1; // carry into the next digit
			result -= 10;
		}
		else
			remainder = 0;
		r[count++] = (char)(result + 0x30);
		index_a--;
		index_b--;
	}
	// A final carry is possible; a final borrow is not, because |a| >= |b|
	if (remainder > 0)
		r[count++] = (char)(remainder + 0x30);
	// r is still reversed: drop leading zeros such as 100 + (-99) = "001"
	while (count > 1 && r[count - 1] == '0')
		count--;
	if (isNeg_r && !(count == 1 && r[0] == '0'))
		r[count++] = '-';
	char temp[24]; // up to 20 digits, a sign, and the terminator
	int t = 0;
	while ((--count) >= 0)
	{
		temp[t++] = r[count];
	}
	temp[t] = '\0';
	strcpy(r, temp);
}
int main()
{
	//Read huge integers as character strings
	int T; //Number of test cases
	cin >> T;
	// An input needs up to 20 characters plus '\0'; the sum can need one more digit
	char a[24];
	char b[24];
	char c[24];
	char result[24];
	for (int i = 0; i < T; i++)
	{
		//Deal with test cases
		cin >> a >> b >> c;
		bigPlus(a, b, result);
		bool is_bigger = isBigger(result, c);
		cout << "Case #" << i + 1 << ": ";
		if (is_bigger)
			cout << "true";
		else
			cout << "false";
		if (i < T - 1)
			cout << endl;
	}
	return 0;
}

Results

Judging result

Time	Result	Score	Problem	Language	Time (ms)	Memory (kB)	User
Feb 22 20:46	Accepted	20	1065	C++ (g++ 4.7.2)	1	360

Test points

Test point	Result	Time (ms)	Memory (kB)	Score/Max
0	Accepted	1	360	12/12
1	Accepted	1	360	4/4
2	Accepted	1	360	4/4

Tags C++, PAT, Algorithm Basics

生活琐碎

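Notice on the Release of the December 2014 College English Test (CET-4/CET-6) Results

1. Release time: 9:00 a.m., February 28, 2015

2. How to check your results
(1) Free online lookup:
Website: cet.99sushe.com  Operator: 99sushe (99宿舍网)  Support: 99sushe online customer service
Website: www.chsi.com.cn/cet  China Higher Education Student Information (CHSI)
(2) Paid SMS lookup:
For China Mobile, China Unicom, and China Telecom mobile users:
Send A followed by your 15-digit admission ticket number (e.g. A110010132100101) to 1066335577 to get your score.
Fee: 1 yuan per message, excluding carrier charges.
Notice on the Release of the November 2014 CET-4/CET-6 Spoken English Test Results
1. Release time: 9:00 a.m., February 28, 2015
2. How to check your results
Website: cet.99sushe.com  Operator: 99sushe (99宿舍网)  Support: 99sushe online customer service
Website: chaxun.neea.edu.cn  the Ministry of Education Examination Center's general query site
Office of the National College English Test Committee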

不学无术

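Combinations by Recursion

I have neglected computer science for so long that this simple problem took me half a day. It is really just how to print every combination of an array; I am writing this post to remind myself not to forget the basics!
The problem can be stated roughly as follows:

Given data A = {a_0, a_1, a_2, …, a_n} (where n varies), print every combination of the array's elements.

Recursion is the better tool here: divide and conquer.

Given a bunch of data {a0, a1, a2, …, an}:
1. If only one element is left, output it and stop;
2. Do not output a0, and hand the rest over to the recursion;
3. Output a0 itself, then hand the rest over to the recursion;

In step 3 the next recursive call has to remember that a0 is to be output, so in practice it carries something like a prefix along. I suspect this kind of permutation/combination problem could also be solved with a stack, but at bottom it is still divide and conquer.
A quick version in Python: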

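def foo(inary, appendix=''):
    """Print every combination of inary, each prefixed by appendix."""
    if len(inary) == 0:
        return
    # Step 1: output the prefix plus the first element
    print(appendix + str(inary[0]))
    if len(inary) < 2:
        return
    # Step 2: combinations of the rest, without inary[0]
    foo(inary[1:], appendix)
    # Step 3: combinations of the rest, with inary[0] carried in the prefix
    foo(inary[1:], appendix + str(inary[0]))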

Test output:

>>> foo([1, 2, 3, 4])
1
2
3
4
34
23
24
234
12
13
14
134
123
124
1234

Tags Python, Algorithms

木有技术

[Updated on Aug.29] ThinkPad X220/X230 Full HD IPS modification

Post author By idailylife
Post date February 10, 2015

[CHINESE] Reference for the details / source of this article: http://forum.51nb.com/viewthread.php?tid=1548345&extra=&highlight=&page=1
[CHINESE] Forum page for buying this modification kit: http://forum.51nb.com/viewthread.php?tid=1548947&extra=&highlight=&page=1
[CHINESE] An FHD modification for the T420/T430 series is also available via a custom LVDS-to-eDP board: http://forum.51nb.com/thread-1552978-1-1.html In plain terms, the EDID information must be coded on the external LVDS-to-eDP board so that no BIOS modification is needed.
First of all, I am NOT the inventor of this method, and the person who achieved this modification has not published its details. I am only a customer who bought his modified adapter.
I am just trying to guess what he did to my motherboard. Some of my descriptions below might be totally wrong, and I will mark those specially.

Here are the main ideas of the modification, in brief:

Abandon the video signal produced by the original LVDS lane, because it cannot support a screen resolution higher than 1440×900
Use the video output intended for the docking station to carry the data, because it supports DisplayPort
(I'm not quite sure about this part) 'Hijack' the backlight signals of the original LVDS lane (33 PANEL_BKLT_CTRL and 33 BACKLIGHT_ON) so as to control the backlight of the new screen panel

An Overview of LVDS/eDP Adapter

Modifications on Motherboard(Video Signals, w/o backlight control)

Yes, there is a workable way to replace the built-in LCD screen with a Full HD (1920×1080) panel. And yes, it still supports one additional external screen, either analog (VGA) or digital (DisplayPort).

First, I must state that I am not the inventor of this solution, and I personally take no responsibility for anything that might happen to your laptop.
kingkonglue on forum.51nb.com is the one who invented this solution; I take no credit for what is described below.

Background Info

As almost everyone knows, the X220/X230 series uses a single-channel LVDS lane to transmit the video signal to the built-in screen panel, which only supports resolutions up to 1440×900. This is the main obstacle to installing an FHD panel on the laptop.
There are at least three solutions to this problem.
Leokim was the first to achieve this goal: he successfully fitted his X220 with a Full HD panel (apparently a touch screen taken from a Dell XPS 12″ series machine, LP125WF1). His solution has never been clearly explained. What we know so far is that it involves a modified BIOS and a modified signal cable, and that the BIOS modification is hard on the X230 series because of its encrypted UEFI BIOS.
The second attempt was made by 东大师 (Dong Da-shi). Presumably he used a self-made LVDS-to-eDP adapter to fit the screen panel, and added some signal wires to the original LVDS cable in order to support the higher resolution. This solution risks signal interference because of the modified LVDS signal cable.
I have also heard of another `shortcut` for this modification: going through the dock interface. The external dock exposes two digital video outputs, so one of them can be used. The remaining problem is backlight control for the panel.

Screen Panel

In 2014 more high-resolution panels became available for the 12.5″ X series. After the Dell XPS, the ThinkPad X240 started to use an eDP video lane instead of LVDS in order to support FHD. As noted above, bridging the gap between LVDS and eDP is the key to this modification, and there are at least two main problems along the way.
BTW, the screen panel model I used is LP125WF2 SPB1 (my guess is that all `LP125WF2` series panels are compatible).

FHD Signal

WIP

Backlight Control

WIP

Problems Remained

(Updated on Aug 29 2015)
Although the FHD panel is now successfully connected to the mainboard, small problems remain in everyday use. The following may occur:

Higher GPU/CPU temperature, especially when an external screen is attached. The original output is still alive: the system actually recognizes TWO screen panels, with the original one partially disabled, and the same video signal is still sent to the original port. When you attach a second screen (such as an external monitor), the system in fact has to drive three outputs.
Driver bug: 16-bit-like color depth (on the X230 series; the X220 series is exempt). I ran into this problem, as did many other X230 owners. The X220 series is not affected if it uses a modified driver provided by the inventor of the adapter. The bug causes 16-bit-like color output on the panel, so color gradients are not as smooth as they should be. If you remember 16-bit color depth on early LCD panels, that is the feel. The root cause lies in the display driver, and no fix has been published so far.
Black screen the first time an external screen is attached. Earlier versions of the adapter (like the one I own) have this bug, but it is reported to be fixed by an upgraded chip on the adapter (a hardware change).

WORK IN PROGRESS.
THIS ARTICLE WILL BE WRITTEN UP IF I HAVE ANY SPARE TIME.

Tags FHD, thinkpad, X230