生而为人

A Programmer's Self-Cultivation


How to Avoid the ReceiverDisconnectedException

21/08/19 02:18:45 ERROR TaskSetManager [task-result-getter-0]: Task 32 in stage 73.0 failed 4 times; aborting job
21/08/19 02:18:45 ERROR FileFormatWriter [stream execution thread for [id = 5f382b82-2006-49a4-a657-5512d5bc6cce, runId = e783c09d-8ea7-4848-b018-2422590426ce]]: Aborting job 6d36e286-ecb0-4a8e-9928-5ac7a0fbb7d2.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 32 in stage 73.0 failed 4 times, most recent failure: Lost task 32.3 in stage 73.0 (TID 18190, wn126-msnbi.awfbdxsze1iudhhki0l2sbzfaf.bx.internal.cloudapp.net, executor 8): java.util.concurrent.CompletionException: com.microsoft.azure.eventhubs.ReceiverDisconnectedException: New receiver 'spark-26-18237' with higher epoch of '0' is created hence current receiver 'spark-8-18190' with epoch '0' is getting disconnected. If you are recreating the receiver, make sure a higher epoch is used. TrackingId:ddef38220000007a00008bf9611dbf7b_G4_B16, SystemTracker:msnsam-prod:eventhub:sambeacon-ueq~4223|$default, Timestamp:2021-08-19T02:18:45, errorContext[NS: msnsam-prod.servicebus.windows.net, PATH: sambeacon-ueq/ConsumerGroups/$Default/Partitions/32, REFERENCE_ID: LN_23ed71_1629339510773_d5c9_G4, PREFETCH_COUNT: 500, LINK_CREDIT: 200, PREFETCH_Q_LEN: 0]
at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:607)
at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:591)
at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488)
at java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1990)
at com.microsoft.azure.eventhubs.impl.ExceptionUtil.completeExceptionally(ExceptionUtil.java:116)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.drainPendingReceives(MessageReceiver.java:505)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.onError(MessageReceiver.java:490)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.onClose(MessageReceiver.java:790)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.processOnClose(BaseLinkHandler.java:73)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.handleRemoteLinkClosed(BaseLinkHandler.java:109)
at com.microsoft.azure.eventhubs.impl.BaseLinkHandler.onLinkRemoteClose(BaseLinkHandler.java:51)
at org.apache.qpid.proton.engine.BaseHandler.handle(BaseHandler.java:176)
at org.apache.qpid.proton.engine.impl.EventImpl.dispatch(EventImpl.java:108)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.dispatch(ReactorImpl.java:324)
at org.apache.qpid.proton.reactor.impl.ReactorImpl.process(ReactorImpl.java:291)
at com.microsoft.azure.eventhubs.impl.MessagingFactory$RunReactor.run(MessagingFactory.java:784)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.microsoft.azure.eventhubs.ReceiverDisconnectedException: New receiver 'spark-26-18237' with higher epoch of '0' is created hence current receiver 'spark-8-18190' with epoch '0' is getting disconnected. If you are recreating the receiver, make sure a higher epoch is used. TrackingId:ddef38220000007a00008bf9611dbf7b_G4_B16, SystemTracker:msnsam-prod:eventhub:sambeacon-ueq~4223|$default, Timestamp:2021-08-19T02:18:45, errorContext[NS: msnsam-prod.servicebus.windows.net, PATH: sambeacon-ueq/ConsumerGroups/$Default/Partitions/32, REFERENCE_ID: LN_23ed71_1629339510773_d5c9_G4, PREFETCH_COUNT: 500, LINK_CREDIT: 200, PREFETCH_Q_LEN: 0]
at com.microsoft.azure.eventhubs.impl.ExceptionUtil.toException(ExceptionUtil.java:43)
at com.microsoft.azure.eventhubs.impl.MessageReceiver.onClose(MessageReceiver.java:789)
... 15 more

Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2065)
at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:167)
at org.apache.spark.sql.execution.streaming.FileStreamSink.addBatch(FileStreamSink.scala:131)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$5$$anonfun$apply$17.apply(MicroBatchExecution.scala:537)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$5.apply(MicroBatchExecution.scala:535)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:534)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:198)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:166)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:160)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:281)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193)
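The exception message in the log above already names the two receivers that collided, their epochs, and the consumer group and partition they were both reading. A minimal Python sketch for pulling those fields out of such a line follows. The helper name `parse_receiver_disconnect` and the shortened `sample` string are illustrative assumptions. The receiver names appear to follow `spark-<executor>-<TID>`, since `spark-8-18190` matches executor 8 and TID 18190 in the same log line, but that reading is an inference from this log rather than a documented guarantee.

```python
import re

# Matches the receiver/epoch sentence as it appears in the log above.
_MSG = re.compile(
    r"New receiver '(?P<new>[^']+)' with higher epoch of '(?P<new_epoch>\d+)' "
    r"is created hence current receiver '(?P<old>[^']+)' with epoch "
    r"'(?P<old_epoch>\d+)' is getting disconnected"
)
# Matches the PATH field of errorContext: <hub>/ConsumerGroups/<group>/Partitions/<n>.
_PATH = re.compile(
    r"PATH: (?P<hub>[^/]+)/ConsumerGroups/(?P<group>[^/]+)/Partitions/(?P<partition>\d+)"
)


def parse_receiver_disconnect(line):
    """Return the colliding receivers and their partition, or None if absent."""
    msg = _MSG.search(line)
    path = _PATH.search(line)
    if msg is None or path is None:
        return None
    info = msg.groupdict()
    info.update(path.groupdict())
    # Equal epochs mean neither receiver explicitly outranked the other.
    info["same_epoch"] = info["new_epoch"] == info["old_epoch"]
    return info


# Shortened excerpt of the message from the log above (fields elided for brevity).
sample = (
    "New receiver 'spark-26-18237' with higher epoch of '0' is created hence "
    "current receiver 'spark-8-18190' with epoch '0' is getting disconnected. "
    "errorContext[NS: msnsam-prod.servicebus.windows.net, "
    "PATH: sambeacon-ueq/ConsumerGroups/$Default/Partitions/32, PREFETCH_COUNT: 500]"
)
result = parse_receiver_disconnect(sample)
print(result)
```

Run over a full driver log, the extracted pairs show which two tasks opened a receiver on the same partition in the same consumer group, which is a reasonable first thing to check before changing any configuration.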

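Doris partitioning rules

Why does the number of partitions in the table schema differ from the number of distinct values in the partition column?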
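One likely part of the answer: with range partitioning, each partition covers an interval of values rather than a single value, and partitions exist because the DDL defined them, not because data arrived. The Python sketch below models that with left-closed, right-open ranges. The partition names, bounds, and sample values are all hypothetical and chosen only to make the counts visible.

```python
from bisect import bisect_right

# Hypothetical range partitions, each defined by an exclusive upper bound,
# in the spirit of PARTITION <name> VALUES LESS THAN ("<bound>").
upper_bounds = ["2021-07-01", "2021-08-01", "2021-09-01", "2021-10-01"]
names = ["p202106", "p202107", "p202108", "p202109"]


def partition_for(value):
    """Return the partition whose range contains value, or None if none does."""
    # Number of bounds <= value is the index of the first bound above value.
    idx = bisect_right(upper_bounds, value)
    if idx == len(upper_bounds):
        return None
    return names[idx]


# Hypothetical partition-column values (ISO dates compare correctly as strings).
values = ["2021-06-15", "2021-07-01", "2021-07-02", "2021-07-20", "2021-08-19"]

distinct_values = len(set(values))
partitions_with_data = sorted({partition_for(v) for v in values})

print("defined partitions:  ", len(names))
print("distinct values:     ", distinct_values)
print("partitions with data:", partitions_with_data)
```

Here five distinct values land in three partitions while four partitions are defined, so the three counts need not agree in either direction.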

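Guidelines

Best practices

Notes on creating a new cluster:

  1. Specify a dedicated VNet so that network controls can be managed flexibly.
  2. For a cluster with many nodes, request a dedicated DB and raise its tier to keep Ambari monitoring performant; the more machines, the higher the tier should be.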

d:\project\APGold\autopilotservice\Global\VirtualEnvironments\AdsBI\AdsMz-Test-MW1-MSNMediation>sd edit deployment.int
deployment.int - file(s) not on client.

Edit directly

  1. add/edit
  2. Install CodeFlow and submit a PR
    \codeflow.redmond.corp.microsoft.com\public\cf2Launcher.cmd

What CodeFlow is for

What a VE is for:
What a PE is for:

Before

D:\project\APGold\autopilotservice\Global\VirtualEnvironments\AdsBI\AdsMz-Test-MW1-MSNMediation

D:\project\APGold\autopilotservice\MW1\AdsMz-Prod-MW1

A ServiceGroup is mandatory for every machine function

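A VE can be thought of as the global config of a project; all PEs share this config.
The design goal is to avoid publishing multiple times when several PEs are deployed for the same project: update the VE once and the change takes effect in the related PEs.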
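The sharing described above can be pictured as a layered lookup. The Python sketch below is a conceptual model only, not Autopilot's actual mechanism: the setting names and values are hypothetical, and the idea that a PE may carry its own overrides on top of the VE is an assumption added for illustration.

```python
from collections import ChainMap

# Hypothetical shared settings held by the VE.
ve_config = {"storage_account": "shared-store", "log_level": "info"}

# Hypothetical per-PE settings (assumption: a PE may override VE values).
pe_overrides = {
    "AdsMz-Test-MW1": {"log_level": "debug"},
    "AdsMz-Prod-MW1": {},
}


def effective_config(pe_name):
    """PE-specific values win; everything else falls through to the VE."""
    return ChainMap(pe_overrides[pe_name], ve_config)


test_cfg = effective_config("AdsMz-Test-MW1")
prod_cfg = effective_config("AdsMz-Prod-MW1")

# A single VE update is visible from every PE without touching the PEs.
ve_config["storage_account"] = "shared-store-v2"

print(test_cfg["storage_account"], test_cfg["log_level"])
print(prod_cfg["storage_account"], prod_cfg["log_level"])
```

Because each `ChainMap` keeps a reference to the shared dictionary rather than a copy, the one update to `ve_config` shows up in both lookups, which mirrors the "update the VE once" goal.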

https://msazure.visualstudio.com

D:\project\APGold\autopilotservice\Global\VirtualEnvironments\AdsBI\AdsMz-Prod-MW1-MSNMediation

07/26/21 16:54:01.841,error processing AdsMz-Test-MW1: Rollout 'AdsBI-AdsMz-Test-MW1-MzOrchestration-VE.6799404_17553440741849145769_0.Rollout_AdsBI_Streaming_CFR' cannot be kicked due to status: InRollout - Rollout in progress..Rollout 'AdsBI-AdsMz-Test-MW1-MSNMediation-VE.6827694_4064931696698155710_0.Rollout_AdsBI_Streaming_CFR' cannot be kicked due to status: OtherRollout - Waiting on other rollout on MF AdsBI_Streaming_CFR in progress: AdsBI_Streaming_CFR.AdvertiserAggs_94d0ef91_90546_1_15-0-897+94d0ef91_CL6670088_5920165827906938206_0.AdsBI-AdsMz-Test-MW1-AdvertiserAggs-VE.csv:AdsBI_Streaming_CFR.AgoraAggs_76700988_85934_1_16-0-173+76700988_CL6670088_4056715873563314682_0.AdsBI-AdsMz-Test-MW1-AgoraAggs-VE.csv:AdsBI_Streaming_CFR.CACFR_9a63868e_90143_2_1-0-2408+9a63868e_CL6670088_11167218336511992544_0.AdsBI-AdsMz-Test-MW1-CACFR-VE.csv:AdsBI_Streaming_CFR.CFR_2d2cea68_91067_1_17-0-6662+2d2cea68_CL6670088_18415503406869780426_0.AdsBI-AdsMz-Test-MW1-CFR-VE.csv:AdsBI_Streaming_CFR.KpiAggs_fdd5f92d_90180_1_17-0-433+fdd5f92d_CL6670088_6626197140932378707_0.AdsBI-AdsMz-Test-MW1-KPIAggs-VE.csv:AdsBI_Streaming_CFR.Monetization_de0db608_91071_1_1-0-2808+de0db608_CL6670088_351842135959690630_0.AdsBI-AdsMz-Test-MW1-Monetization-VE.csv:AdsBI_Streaming_CFR.Orchestration_93eb645e_91078_1_16-0-557+93eb645e_CL6799404_17553440741849145769_0.AdsBI-AdsMz-Test-MW1-MzOrchestration-VE.csv:AdsBI_Streaming_CFR.PublisherAggs_55bd6dd9_90534_1_17-0-474+55bd6dd9_CL6670088_15361642287162420254_0.AdsBI-AdsMz-Test-MW1-PublisherAggs-VE.csv:AdsBI_Streaming_CFR.Mediation_865d3b39_90170_1_merge_20210629_-1_CL6670088_3189803787641037971_0.AdsBI-AdsMz-Test-MW1-Mediation-VE.csv:AdsBI_Streaming_CFR.MSNMediation_4a63e32e_91077_2_0-1-28+4a63e32e33_CL6827694_4798040603905411799_0.AdsBI-AdsMz-Test-MW1-MSNMediation-VE.csv:.OM.Autopilot-AutopilotClient-VE.csv..; .. Some rollouts not triggered.
Deployment: 6827694_4798040603905411799 (this is NOT the latest deployment.ini change number)
Downloaded 89 files, 21510455 bytes, compressed size: 8267353 bytes.
Failed to build services: Failed to build service 'SAMHourlyS2B.MSNMediation_4a63e32e_91077_2_0-1-28+4a63e32e33': Proxy error: not all data was received (DP: 25.68.114.53; by MW01NAP0000036C); ; ;Log:e:GetStreamChunkMapWithRetry:stream does not exist [response from server] : GetStreamChunkMap('stream://AdsBI-AdsMz-Test-MW1-MSNMediation-VE/app/ServiceMaps/MSNMediationServiceMap.ini'; 'MSNMediation_4a63e32e_91077_2_0-1-28+4a63e32e33') failed ;Log:w:ApDynamicStorage::StorageConnectionEx::DownloadBufferFromStreamWithRetry:Stream (stream://AdsBI-AdsMz-Test-MW1-MSNMediation-VE/app/ServiceMaps/MSNMediationServiceMap.ini _ MSNMediation_4a63e32e_91077_2_0-1-28+4a63e32e33) not found in storage ;
Majority of the building errors are temporary and will be fixed automatically.
If there is no change or progress in the app deployment log after 15 minutes, contact apswat for production environments and aptalk for non-production environment.
Look in App Deployment log for more info

Here are common error messages and possible fixes:
Msg: Error 53: The network path was not found.: ac: 0
Fix: Try adding REDMOND@ to the build path.

Msg: Msg: EDP010385: APSEQREAD::GetDataPointer error 2
Fix: Try adding REDMOND@ to the build path.

Msg: Proxy reported error: EDP010196: ApckBuilder Error 3: The system cannot find the path specified.
Fix: Look for missing directories under your build drop.

For more troubleshooting info, please refer to App Deployment Troubleshooting
Deployment: 6827694_9979618793807238839_0 (this is NOT the latest deployment.ini change number)
Downloaded [unknown] files, [unknown] bytes.
Failed to import version '6827694_9979618793807238839' of image for environment 'AdsBI-AdsMz-Test-MW1-MSNMediation-VE': Error downloading image from remote storage (stream 'stream://AdsBI-AdsMz-Test-MW1-MSNMediation-VE/app/image.ini@6827694_9979618793807238839')
Majority of the building errors are temporary and will be fixed automatically.
If there is no change or progress in the app deployment log after 15 minutes, contact apswat for production environments and aptalk for non-production environment.
Look in App Deployment log for more info

Here are common error messages and possible fixes:
Msg: Error 53: The network path was not found.: ac: 0
Fix: Try adding REDMOND@ to the build path.

Msg: Msg: EDP010385: APSEQREAD::GetDataPointer error 2
Fix: Try adding REDMOND@ to the build path.

Msg: Proxy reported error: EDP010196: ApckBuilder Error 3: The system cannot find the path specified.
Fix: Look for missing directories under your build drop.

For more troubleshooting info, please refer to App Deployment Troubleshooting

Severity Code Description Project File Line Suppression State
Error MSB3680 The source file “D:\project\Ads.BI.MSNMediation\private\src\Batch/Autopilot/ExternalConfigs/Microsoft.BingAds.BI.APDeploy.exe.config” does not exist. Microsoft.BI.MSNMediation.HourlyS2B.Drop D:\project\Ads.BI.MSNMediation\private\src\Batch\build\targets\CreateDrop.targets 47

Severity Code Description Project File Line Suppression State
Error MSB3030 Could not copy the file “D:\project\Ads.BI.MSNMediation\private\src\Batch/SAMHourlyS2B/Microsoft.BI.MSNMediation.HourlyS2B.Drop/bin\Debug\net472**” because it was not found. Microsoft.BI.MSNMediation.HourlyS2B.Drop D:\project\Ads.BI.MSNMediation\private\src\Batch\build\targets\CreateDrop.targets 47

c:\Users\jingqicao.nuget\packages\microsoft.bingads.bi.apdeploy\10.5.2

NuGet package changes made outside the SDK, such as deleting a package

resolved Show files that have been merged but not submitted (independent of the directory it is run from)
retype Reappraise the file type for files on the client
revert Discard changes from an opened file
review List and track changelists (for the review daemon)

sd review: how to search for only one person's reviews

FAREAST\jingqicao

sd opened

sd submit -c

Azure Cosmos commands