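with temp_table as (
count(distinct ...) counts only the distinct values in which every listed field is non-NULL; any row with a NULL in one of those fields is skipped.
So when using count distinct to check a field for duplicates, remember to replace NULLs in the relevant fields with a placeholder value first.
select count(*)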
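A minimal sketch of the behaviour described above, run against an in-memory SQLite database from Python as a stand-in engine (an assumption; the original notes do not name the engine here). The contents of `temp_table`, the single column `f1`, and the `'<null>'` placeholder are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in engine (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table temp_table (f1 text)")
conn.executemany(
    "insert into temp_table values (?)",
    [("a",), ("a",), ("b",), (None,), (None,)],  # hypothetical sample rows
)

# NULL rows are skipped, so only 'a' and 'b' are counted.
plain = conn.execute(
    "select count(distinct f1) from temp_table"
).fetchone()[0]

# Map NULL to a placeholder first so it counts as one distinct value.
filled = conn.execute(
    "select count(distinct coalesce(f1, '<null>')) from temp_table"
).fetchone()[0]

# count(*) counts every row, NULL or not.
total = conn.execute("select count(*) from temp_table").fetchone()[0]

print(plain, filled, total)  # 2 3 5
```

Comparing `count(*)` with the plain `count(distinct f1)` would overstate the duplicates here, because the two NULL rows are missing from the distinct count altogether.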
Rows whose join key is NULL never match in a join, even when the key is NULL on both sides.
select a.f1, a.f2, b.f1, b.f2
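The join behaviour can be checked with a small sketch, again using in-memory SQLite from Python as a stand-in engine (an assumption). Tables `a` and `b` and their rows are hypothetical. The last query uses SQLite's `is` operator as a NULL-safe comparison; MySQL offers the `<=>` operator for the same purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table a (f1 text, f2 text);
    create table b (f1 text, f2 text);
    insert into a values ('k1', 'a1'), (null, 'a2');
    insert into b values ('k1', 'b1'), (null, 'b2');
""")

# Plain equality: NULL = NULL is not true, so the NULL rows never pair up.
equi = conn.execute(
    "select a.f1, a.f2, b.f1, b.f2 from a join b on a.f1 = b.f1"
).fetchall()

# Left join: the NULL-keyed row from a survives, but with nothing from b.
left = conn.execute(
    "select a.f1, a.f2, b.f1, b.f2 "
    "from a left join b on a.f1 = b.f1 order by a.f2"
).fetchall()

# NULL-safe comparison (SQLite syntax): NULL is treated as equal to NULL.
null_safe = conn.execute(
    "select a.f1, a.f2, b.f1, b.f2 "
    "from a join b on a.f1 is b.f1 order by a.f2"
).fetchall()

print(equi)       # [('k1', 'a1', 'k1', 'b1')]
print(left)       # [('k1', 'a1', 'k1', 'b1'), (None, 'a2', None, None)]
print(null_safe)  # [('k1', 'a1', 'k1', 'b1'), (None, 'a2', None, 'b2')]
```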
NULL values in the input to sum are ignored and do not affect the result.
select sum(ct)
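A short sketch of this, using in-memory SQLite from Python as a stand-in engine (an assumption). Table `t` and its rows are hypothetical. It also shows a related edge case: when every input is NULL, `sum` itself returns NULL, which `coalesce` can turn into 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (ct integer)")
conn.executemany("insert into t values (?)", [(1,), (None,), (2,)])

# The NULL row is ignored: 1 + 2 = 3.
with_nulls = conn.execute("select sum(ct) from t").fetchone()[0]

# If there is no non-NULL input at all, the sum is NULL (None in Python).
only_nulls = conn.execute(
    "select sum(ct) from t where ct is null"
).fetchone()[0]

# coalesce gives a numeric default for that case.
guarded = conn.execute(
    "select coalesce(sum(ct), 0) from t where ct is null"
).fetchone()[0]

print(with_nulls, only_nulls, guarded)  # 3 None 0
```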
select count(distinct seller_bu_id) as ct1,
Specify the collation
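Symptom 1: a time written to MySQL is several hours off from the actual time when viewed in MySQL, yet the program reads it back correctly.
MySQL time zone issues, and how the datetime and timestamp types differ in storage
MySQL Config: the system_time_zone and time_zone parameters
Query the system time zone:
show global variables like '%time_zone%'
Solution: set serverTimezone explicitly in the JDBC connection URL:
jdbc:mysql://10.48.204.231:5002/waimai_hubble_analysis_test?useUnicode=true&useSSL=false&serverTimezone=GMT%2B8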
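Two details of the URL above can be illustrated with a small Python sketch: `GMT%2B8` is the URL-encoded form of `GMT+8` (a raw `+` in a query string is read as a space by form-style decoders), and a client and server that disagree about the zone see the same instant with different wall-clock times. The sample date and the UTC server zone are assumptions for illustration; the notes only say the times differ by "several hours".

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, quote, unquote

# '+' must be percent-encoded inside a URL query string.
encoded = quote("GMT+8")
decoded = unquote(encoded)

# Left unencoded, form-style decoding turns '+' into a space.
mangled = parse_qs("serverTimezone=GMT+8")["serverTimezone"][0]

# The same instant seen from GMT+8 and from UTC (hypothetical zones).
gmt8 = timezone(timedelta(hours=8))
instant = datetime(2020, 1, 1, 12, 0, tzinfo=gmt8)  # hypothetical sample time
as_utc = instant.astimezone(timezone.utc)

# Difference between the two wall-clock readings, in hours.
gap_hours = (
    instant.replace(tzinfo=None) - as_utc.replace(tzinfo=None)
) / timedelta(hours=1)

print(encoded, decoded, repr(mangled))  # GMT%2B8 GMT+8 'GMT 8'
print(instant.hour, as_utc.hour, gap_hours)  # 12 4 8.0
```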
If an argument contains special characters, the shell may fail to parse it.
Wrap it in single quotes: the text is then used verbatim, and the shell makes no attempt to interpret it.
curl -k -v -X POST "https://xxx.net/livy/batches" -u 'admin:xxx!QAZ' -H "Content-Type: application/json" -H "X-Requested-By: admin" -d '{"file":"/users/jingqicao/sparkSubmission/SAMStreaming/ver-01/bin/SAM-1.0-SNAPSHOT.jar", "driverMemory": "30G", "driverCores": 4, "executorCores": 14, "executorMemory": "98G", "numExecutors": 256, "name": "SAMStreaming-ver01-1020-02", "className":"com.microsoft.sam.SAMJobRunner", "args":["ver-01-1020-02","/users/jingqicao/sparkSubmission/SAMStreaming/ver-01/bin/SAMJob_4HD.conf"]}'
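When such a command is assembled by a script rather than typed by hand, the same single-quote rule can be applied automatically. The sketch below uses Python's standard `shlex` module; the shortened `curl -u ...` command line is only an illustration, not the full command above.

```python
import shlex

credentials = "admin:xxx!QAZ"  # value taken from the command above

# shlex.quote wraps the value in single quotes because '!' is not shell-safe.
quoted = shlex.quote(credentials)

# A value made only of safe characters is returned unchanged.
untouched = shlex.quote("admin")

# Splitting the quoted command line the way a POSIX shell would
# yields the original text back as a single argument.
argv = shlex.split(f"curl -u {quoted}")

print(quoted)     # 'admin:xxx!QAZ'
print(untouched)  # admin
print(argv)       # ['curl', '-u', 'admin:xxx!QAZ']
```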
nautilus
ps aux
# 1.