Text and Typography
This Jekyll template is fully compatible with Markdown syntax. Now, let's take a look at the text and typography in this theme. Titles H1 H2 H3 H4 Paragraph I wandered lonely as a cloud ...
Structured Streaming with ElasticSearch
Maven dependency <dependency> <groupId>org.elasticsearch</groupId> <artifactId>elasticsearch-spark-20_2.11</artifactId> <version>6.2.4</version> </de...
Spark shuffle
Hash Shuffle V1: the total number of files is M*R. Hash Shuffle V2: M(Executor)*R. For example, with 500 map tasks spread across 10 Executors, one core per Executor and 50 tasks per Executor, the total number of files is 10*R. Sort Shuffle V1 Sort Shuffle V2
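The file-count arithmetic in this excerpt can be sketched as a small Java program. This is only an illustration: the class and method names are hypothetical, the reduce-task count R = 200 is an assumed value that the post does not give, and multiplying by cores per Executor is an assumption that generalizes the post's one-core-per-Executor example.

```java
// Illustrative arithmetic only; class and method names are hypothetical.
final class ShuffleFileCount {

    // Hash Shuffle V1: every map task writes one file per reduce task (M * R).
    static long hashShuffleV1(long mapTasks, long reduceTasks) {
        return mapTasks * reduceTasks;
    }

    // Hash Shuffle V2: map tasks that run on the same core share output files,
    // so the count depends on Executors (times cores each, by assumption) and R.
    static long hashShuffleV2(long executors, long coresPerExecutor, long reduceTasks) {
        return executors * coresPerExecutor * reduceTasks;
    }

    public static void main(String[] args) {
        long r = 200; // assumed number of reduce tasks; not stated in the post
        System.out.println("Hash Shuffle V1 files: " + hashShuffleV1(500, r));
        System.out.println("Hash Shuffle V2 files: " + hashShuffleV2(10, 1, r));
    }
}
```

With the assumed R = 200, the 500 map tasks produce 100000 files under V1, while the 10 single-core Executors produce 2000 files under V2.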
How Spark divides stages
rdd -> job -> stage -> task sc.parallelize(1 to 10000, 2).map { i => Thread.sleep(10); i }.count() def parallelize[T: ClassTag]( seq: Seq[T], numSlices: Int = defaultParallelism): RDD...
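Precision loss in float, double, and decimal
This started as a problem a colleague reported while using Spark SQL. The SQL was: %hive select 10900*now_pocket_limit_rate as d from dp_fk_tmp.dszx_88w_v2 where uid='41195' The value of now_pocket_limit_rate in the table is 0.7, and the returned result was: 7629.999999999999 At that point, what came to mind, of all things, was ...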
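The reported result can be reproduced outside Spark with plain Java, assuming the column is stored as a double (the excerpt is cut off before it says so). The sketch below is not the post's own code; the class and method names are hypothetical. It contrasts binary floating-point multiplication with BigDecimal arithmetic.

```java
import java.math.BigDecimal;

// Hypothetical demo class; assumes now_pocket_limit_rate is a double column.
final class PrecisionLossDemo {

    // double cannot represent 0.7 exactly, so the product lands just below 7630.
    static double multiplyAsDouble(int amount, double rate) {
        return amount * rate;
    }

    // BigDecimal built from the string "0.7" keeps the decimal value exact.
    static BigDecimal multiplyAsDecimal(int amount, String rate) {
        return new BigDecimal(amount).multiply(new BigDecimal(rate));
    }

    public static void main(String[] args) {
        System.out.println(multiplyAsDouble(10900, 0.7));    // prints 7629.999999999999
        System.out.println(multiplyAsDecimal(10900, "0.7")); // prints 7630.0
    }
}
```

Constructing the BigDecimal from a string matters here: new BigDecimal(0.7) would carry over the double's binary approximation instead of the exact decimal value.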
How Spark Thrift works
Hive JDBC client: the client's entry point for executing SQL is statement.execute(sql); HiveStatement.java public boolean execute(String sql) throws SQLException { runAsyncOnServer(sql); TGetOperationStatusResp status = waitForOp...
Building Zeppelin from source
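Build: this uses the zeppelin-0.8.0 zip package. If you use the 0.8.0 all package directly, note that the JDK version must not be lower than jdk1.8.0_144; decompiling javax.ws shows it was compiled against that version. cd zeppelin-0.8.0 mvn clean package -Pbuild-distr -DskipTests -Denforcer.skip=true -Dcheckstyle.skip=t...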
Etcd snapshot
etcd's messages and log entries are all handled (persisted) inside the goroutine started by raftNode.start() //etcdserver/raft.go // start prepares and starts raftNode in a new goroutine. It is no longer safe // to modify the fields after it has been started. ...