Paper/Spigot-Server-Patches/0448-Fix-Chunk-Post-Processing-deadlock-risk.patch

61 lines
3.3 KiB
Diff
Raw Normal View History

From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Aikar <aikar@aikar.co>
Date: Sat, 18 Apr 2020 04:36:11 -0400
Subject: [PATCH] Fix Chunk Post Processing deadlock risk
See: https://gist.github.com/aikar/dd22bbd2a3d78a2fd3d92e95e9f28dc6
as part of post processing a chunk, we can call ChunkConverter.
ChunkConverter then kicks off major physics updates, and when blocks
that have connections across chunk boundries occur, a recursive risk
can occur where A updates a block that triggers a physics request.
That physics request may trigger a chunk request, that then enqueues
a task into the Mailbox ChunkTaskQueueSorter.
If anything requests that same chunk that is in the middle of conversion,
it's mailbox queue is going to be held up, so the subsequent chunk request
will be unable to proceed.
We delay post processing of Chunk.A() 1 "pass" by re stuffing it back into
the executor so that the mailbox ChunkQueue is now considered empty.
This successfully fixed a reoccurring and highly reproduceable crash
for heightmaps.
diff --git a/src/main/java/net/minecraft/server/ChunkProviderServer.java b/src/main/java/net/minecraft/server/ChunkProviderServer.java
index cc91266704043b46653bdd512637d001677ddc76..47a02737faa2eddca1b1f9063fe46afc91425169 100644
--- a/src/main/java/net/minecraft/server/ChunkProviderServer.java
+++ b/src/main/java/net/minecraft/server/ChunkProviderServer.java
@@ -995,6 +995,7 @@ public class ChunkProviderServer extends IChunkProvider {
Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 05:47:29 +02:00
return super.executeNext() || execChunkTask; // Paper
}
} finally {
Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 05:47:29 +02:00
+ playerChunkMap.chunkLoadConversionCallbackExecutor.run(); // Paper - Add chunk load conversion callback executor to prevent deadlock due to recursion in the chunk task queue sorter
playerChunkMap.callbackExecutor.run();
Improve mid tick chunk loading, Fix Oversleep, other improvements Process loads outside of any canSleep check. Original intent was to only apply those restrictions to generations but realized I had some checks higher up the call chain. Reworked the back off strategy to just run every 1 millisecond per world, and to apply the per tick limit to generations only. This guarantees that your chunk will load with at most around 1ms delay. Additionally, fire midTick processing in a few more places, notably the oversleep section so we can keep processing loads here too which has a large up to 50ms window... Speaking of oversleep, we had a bug in our implementation changes for Timings that caused oversleep to not sleep the correct amount. Because we now moved it into the NEXT tick instead of THIS tick, the value of nextTick had already been increased to +50ms, resulting in the risk of sleeping more than it should, but, more importantly, this caused every task that was trying to NOT run during oversleep to actually run during oversleep. This is now fixed. Another small tweak is to the /tps command, to no longer show the star when TPS is right at 20. Due to ineffeciencies in the sleep precision, TPS is commonly 20.02. This causes the star to show up almost constantly, so now only show it if we actually hit a real "catchup". This commit also improves the changes to the CallbackExecutor, in that it now is also recursion safe. It was possible that the executor could run tasks out of desired order if the executor task scheduled more executor tasks. We solve this by ensuring new additions do not enter the currently iterated queue. Each depth level will have its own queue. Fixes #3220
2020-04-26 05:47:29 +02:00
}
// CraftBukkit end
diff --git a/src/main/java/net/minecraft/server/PlayerChunkMap.java b/src/main/java/net/minecraft/server/PlayerChunkMap.java
index 16e4acdb0f834883a480829a864ef7570035bc26..842f5ebad2a4d040b9912ec4841de426667cd91d 100644
--- a/src/main/java/net/minecraft/server/PlayerChunkMap.java
+++ b/src/main/java/net/minecraft/server/PlayerChunkMap.java
@@ -134,6 +134,8 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
};
// CraftBukkit end
+ final CallbackExecutor chunkLoadConversionCallbackExecutor = new CallbackExecutor(); // Paper
+
// Paper start - distance maps
private final com.destroystokyo.paper.util.misc.PooledLinkedHashSets<EntityPlayer> pooledLinkedPlayerHashSets = new com.destroystokyo.paper.util.misc.PooledLinkedHashSets<>();
@@ -999,7 +1001,7 @@ public class PlayerChunkMap extends IChunkLoader implements PlayerChunk.d {
return Either.left(chunk);
});
}, (runnable) -> {
- this.mailboxMain.a(ChunkTaskQueueSorter.a(playerchunk, runnable));
+ this.mailboxMain.a(ChunkTaskQueueSorter.a(playerchunk, () -> PlayerChunkMap.this.chunkLoadConversionCallbackExecutor.execute(runnable))); // Paper - delay running Chunk post processing until outside of the sorter to prevent a deadlock scenario when post processing causes another chunk request.
});
completablefuture1.thenAcceptAsync((either) -> {