Paper/patches/server/0132-Properly-handle-async-calls-to-restart-the-server.patch

291 lines
12 KiB
Diff
Raw Normal View History

2021-06-11 14:02:28 +02:00
From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
From: Zach Brown <zach.brown@destroystokyo.com>
Date: Fri, 12 May 2017 23:34:11 -0500
Subject: [PATCH] Properly handle async calls to restart the server
The watchdog thread calls the server restart function asynchronously. Prior to
this change, it attempted to do several non-safe operations from the watchdog
thread, rather than the main. Specifically, because of a separate upstream change,
it causes player entities to be ticked asynchronously, among other things.
This is dangerous.
This patch moves the old handling into a synchronous variant, for calls from the
restart command, and adds separate handling for async calls, such as those from
the watchdog thread.
When calling from the watchdog thread, we cannot assume the main thread is in a
tickable state; it may be completely deadlocked. In order to handle this, we mark
the server as stopping, in order to account for situations where the server should
complete a tick reasonbly soon, i.e. 99% of cases.
Should the server not enter a state where it is stopping within 10 seconds, We
will assume that the server has in fact deadlocked and will proceed to force
kill the server.
This modification does not force restart the server should we actually enter a
deadlocked state where the server is stopping, whereas this will in most cases
exit within a reasonable amount of time, to put a fixed limit on a process that
will have plugins and worlds saving to the disk has a high potential to result
in corruption/dataloss.
diff --git a/src/main/java/net/minecraft/server/MinecraftServer.java b/src/main/java/net/minecraft/server/MinecraftServer.java
Rewrite chunk system (#8177) Patch documentation to come Issues with the old system that are fixed now: - World generation does not scale with cpu cores effectively. - Relies on the main thread for scheduling and maintaining chunk state, dropping chunk load/generate rates at lower tps. - Unreliable prioritisation of chunk gen/load calls that block the main thread. - Shutdown logic is utterly unreliable, as it has to wait for all chunks to unload - is it guaranteed that the chunk system is in a state on shutdown that it can reliably do this? Watchdog shutdown also typically failed due to thread checks, which is now resolved. - Saving of data is not unified (i.e can save chunk data without saving entity data, poses problems for desync if shutdown is really abnormal. - Entities are not loaded with chunks. This caused quite a bit of headache for Chunk#getEntities API, but now the new chunk system loads entities with chunks so that they are ready whenever the chunk loads in. Effectively brings the behavior back to 1.16 era, but still storing entities in their own separate regionfiles. The above list is not complete. The patch documentation will complete it. New chunk system hard relies on starlight and dataconverter, and most importantly the new concurrent utilities in ConcurrentUtil. Some of the old async chunk i/o interface (i.e the old file io thread reroutes _some_ calls to the new file io thread) is kept for plugin compat reasons. It will be removed in the next major version of minecraft. The old legacy chunk system patches have been moved to the removed folder in case we need them again.
2022-09-26 10:02:51 +02:00
index f1d4a7a9e74adc18e18b2df960794ec8c05ce340..37a3c1bd60dbd0e0069120d4f48a17cfbc82dca1 100644
2021-06-11 14:02:28 +02:00
--- a/src/main/java/net/minecraft/server/MinecraftServer.java
+++ b/src/main/java/net/minecraft/server/MinecraftServer.java
2022-07-27 21:49:24 +02:00
@@ -221,6 +221,7 @@ public abstract class MinecraftServer extends ReentrantBlockableEventLoop<TickTa
private Map<ResourceKey<Level>, ServerLevel> levels;
2021-06-11 14:02:28 +02:00
private PlayerList playerList;
private volatile boolean running;
+ private volatile boolean isRestarting = false; // Paper - flag to signify we're attempting to restart
private boolean stopped;
private int tickCount;
protected final Proxy proxy;
Rewrite chunk system (#8177) Patch documentation to come Issues with the old system that are fixed now: - World generation does not scale with cpu cores effectively. - Relies on the main thread for scheduling and maintaining chunk state, dropping chunk load/generate rates at lower tps. - Unreliable prioritisation of chunk gen/load calls that block the main thread. - Shutdown logic is utterly unreliable, as it has to wait for all chunks to unload - is it guaranteed that the chunk system is in a state on shutdown that it can reliably do this? Watchdog shutdown also typically failed due to thread checks, which is now resolved. - Saving of data is not unified (i.e can save chunk data without saving entity data, poses problems for desync if shutdown is really abnormal. - Entities are not loaded with chunks. This caused quite a bit of headache for Chunk#getEntities API, but now the new chunk system loads entities with chunks so that they are ready whenever the chunk loads in. Effectively brings the behavior back to 1.16 era, but still storing entities in their own separate regionfiles. The above list is not complete. The patch documentation will complete it. New chunk system hard relies on starlight and dataconverter, and most importantly the new concurrent utilities in ConcurrentUtil. Some of the old async chunk i/o interface (i.e the old file io thread reroutes _some_ calls to the new file io thread) is kept for plugin compat reasons. It will be removed in the next major version of minecraft. The old legacy chunk system patches have been moved to the removed folder in case we need them again.
2022-09-26 10:02:51 +02:00
@@ -894,7 +895,7 @@ public abstract class MinecraftServer extends ReentrantBlockableEventLoop<TickTa
2021-06-11 14:02:28 +02:00
if (this.playerList != null) {
MinecraftServer.LOGGER.info("Saving players");
this.playerList.saveAll();
- this.playerList.removeAll();
+ this.playerList.removeAll(this.isRestarting); // Paper
2021-06-11 14:02:28 +02:00
try { Thread.sleep(100); } catch (InterruptedException ex) {} // CraftBukkit - SPIGOT-625 - give server at least a chance to send packets
}
Rewrite chunk system (#8177) Patch documentation to come Issues with the old system that are fixed now: - World generation does not scale with cpu cores effectively. - Relies on the main thread for scheduling and maintaining chunk state, dropping chunk load/generate rates at lower tps. - Unreliable prioritisation of chunk gen/load calls that block the main thread. - Shutdown logic is utterly unreliable, as it has to wait for all chunks to unload - is it guaranteed that the chunk system is in a state on shutdown that it can reliably do this? Watchdog shutdown also typically failed due to thread checks, which is now resolved. - Saving of data is not unified (i.e can save chunk data without saving entity data, poses problems for desync if shutdown is really abnormal. - Entities are not loaded with chunks. This caused quite a bit of headache for Chunk#getEntities API, but now the new chunk system loads entities with chunks so that they are ready whenever the chunk loads in. Effectively brings the behavior back to 1.16 era, but still storing entities in their own separate regionfiles. The above list is not complete. The patch documentation will complete it. New chunk system hard relies on starlight and dataconverter, and most importantly the new concurrent utilities in ConcurrentUtil. Some of the old async chunk i/o interface (i.e the old file io thread reroutes _some_ calls to the new file io thread) is kept for plugin compat reasons. It will be removed in the next major version of minecraft. The old legacy chunk system patches have been moved to the removed folder in case we need them again.
2022-09-26 10:02:51 +02:00
@@ -945,6 +946,12 @@ public abstract class MinecraftServer extends ReentrantBlockableEventLoop<TickTa
2021-06-11 14:02:28 +02:00
}
public void halt(boolean flag) {
+ // Paper start - allow passing of the intent to restart
2021-06-11 14:02:28 +02:00
+ this.safeShutdown(flag, false);
+ }
+ public void safeShutdown(boolean flag, boolean isRestarting) {
+ this.isRestarting = isRestarting;
+ // Paper end
this.running = false;
2021-06-11 14:02:28 +02:00
if (flag) {
try {
diff --git a/src/main/java/net/minecraft/server/players/PlayerList.java b/src/main/java/net/minecraft/server/players/PlayerList.java
Rewrite chunk system (#8177) Patch documentation to come Issues with the old system that are fixed now: - World generation does not scale with cpu cores effectively. - Relies on the main thread for scheduling and maintaining chunk state, dropping chunk load/generate rates at lower tps. - Unreliable prioritisation of chunk gen/load calls that block the main thread. - Shutdown logic is utterly unreliable, as it has to wait for all chunks to unload - is it guaranteed that the chunk system is in a state on shutdown that it can reliably do this? Watchdog shutdown also typically failed due to thread checks, which is now resolved. - Saving of data is not unified (i.e can save chunk data without saving entity data, poses problems for desync if shutdown is really abnormal. - Entities are not loaded with chunks. This caused quite a bit of headache for Chunk#getEntities API, but now the new chunk system loads entities with chunks so that they are ready whenever the chunk loads in. Effectively brings the behavior back to 1.16 era, but still storing entities in their own separate regionfiles. The above list is not complete. The patch documentation will complete it. New chunk system hard relies on starlight and dataconverter, and most importantly the new concurrent utilities in ConcurrentUtil. Some of the old async chunk i/o interface (i.e the old file io thread reroutes _some_ calls to the new file io thread) is kept for plugin compat reasons. It will be removed in the next major version of minecraft. The old legacy chunk system patches have been moved to the removed folder in case we need them again.
2022-09-26 10:02:51 +02:00
index acb3d75f0777eab5aa117679d2328c22f46cf823..5820dbb6d2129e2f99207e39c3ed9e661610f491 100644
2021-06-11 14:02:28 +02:00
--- a/src/main/java/net/minecraft/server/players/PlayerList.java
+++ b/src/main/java/net/minecraft/server/players/PlayerList.java
@@ -1153,8 +1153,15 @@ public abstract class PlayerList {
2021-06-11 14:02:28 +02:00
}
public void removeAll() {
+ // Paper start - Extract method to allow for restarting flag
+ this.removeAll(false);
2021-06-11 14:02:28 +02:00
+ }
+
+ public void removeAll(boolean isRestarting) {
+ // Paper end
2021-06-11 14:02:28 +02:00
// CraftBukkit start - disconnect safely
for (ServerPlayer player : this.players) {
+ if (isRestarting) player.connection.disconnect(org.spigotmc.SpigotConfig.restartMessage); else // Paper
player.connection.disconnect(this.server.server.shutdownMessage()); // CraftBukkit - add custom shutdown message // Paper - Adventure
}
// CraftBukkit end
diff --git a/src/main/java/org/spigotmc/RestartCommand.java b/src/main/java/org/spigotmc/RestartCommand.java
index 94d8ba376cd1f024b244654cac9bb62bb19e3060..a142a56a920e153ed84c08cece993f10d76f7793 100644
2021-06-11 14:02:28 +02:00
--- a/src/main/java/org/spigotmc/RestartCommand.java
+++ b/src/main/java/org/spigotmc/RestartCommand.java
@@ -46,86 +46,134 @@ public class RestartCommand extends Command
org.spigotmc.AsyncCatcher.shuttingDown = true; // Paper
try
{
- String[] split = restartScript.split( " " );
- if ( split.length > 0 && new File( split[0] ).isFile() )
+ // Paper - extract method and cleanup
+ boolean isRestarting = addShutdownHook( restartScript );
+ if ( isRestarting )
{
- System.out.println( "Attempting to restart with " + restartScript );
+ System.out.println( "Attempting to restart with " + SpigotConfig.restartScript );
+ } else
+ {
+ System.out.println( "Startup script '" + SpigotConfig.restartScript + "' does not exist! Stopping server." );
+ }
+ // Stop the watchdog
+ WatchdogThread.doStop();
- // Disable Watchdog
- WatchdogThread.doStop();
+ shutdownServer( isRestarting );
+ // Paper end
+ } catch ( Exception ex )
+ {
+ ex.printStackTrace();
+ }
+ }
- // Kick all players
- for ( ServerPlayer p : (List<ServerPlayer>) MinecraftServer.getServer().getPlayerList().players )
- {
- p.connection.disconnect(SpigotConfig.restartMessage);
- }
- // Give the socket a chance to send the packets
- try
- {
- Thread.sleep( 100 );
- } catch ( InterruptedException ex )
- {
- }
- // Close the socket so we can rebind with the new process
- MinecraftServer.getServer().getConnection().stop();
+ // Paper start - sync copied from above with minor changes, async added
+ private static void shutdownServer(boolean isRestarting)
+ {
+ if ( MinecraftServer.getServer().isSameThread() )
+ {
+ // Kick all players
+ for ( ServerPlayer p : com.google.common.collect.ImmutableList.copyOf( MinecraftServer.getServer().getPlayerList().players ) )
+ {
+ p.connection.disconnect(SpigotConfig.restartMessage);
+ }
+ // Give the socket a chance to send the packets
+ try
+ {
+ Thread.sleep( 100 );
+ } catch ( InterruptedException ex )
+ {
+ }
- // Give time for it to kick in
- try
- {
- Thread.sleep( 100 );
- } catch ( InterruptedException ex )
- {
- }
+ closeSocket();
- // Actually shutdown
- try
- {
- MinecraftServer.getServer().close();
- } catch ( Throwable t )
- {
- }
+ // Actually shutdown
+ try
+ {
+ MinecraftServer.getServer().close(); // calls stop()
+ } catch ( Throwable t )
+ {
+ }
+
+ // Actually stop the JVM
+ System.exit( 0 );
- // This will be done AFTER the server has completely halted
- Thread shutdownHook = new Thread()
+ } else
+ {
+ // Mark the server to shutdown at the end of the tick
+ MinecraftServer.getServer().safeShutdown( false, isRestarting );
+
+ // wait 10 seconds to see if we're actually going to try shutdown
+ try
+ {
+ Thread.sleep( 10000 );
+ }
+ catch (InterruptedException ignored)
+ {
+ }
+
+ // Check if we've actually hit a state where the server is going to safely shutdown
+ // if we have, let the server stop as usual
+ if (MinecraftServer.getServer().isStopped()) return;
+
+ // If the server hasn't stopped by now, assume worse case and kill
+ closeSocket();
+ System.exit( 0 );
+ }
+ }
+ // Paper end
+
+ // Paper - Split from moved code
+ private static void closeSocket()
+ {
+ // Close the socket so we can rebind with the new process
+ MinecraftServer.getServer().getConnection().stop();
+
+ // Give time for it to kick in
+ try
+ {
+ Thread.sleep( 100 );
+ } catch ( InterruptedException ex )
+ {
+ }
+ }
+ // Paper end
+
+ // Paper start - copied from above and modified to return if the hook registered
+ private static boolean addShutdownHook(String restartScript)
+ {
+ String[] split = restartScript.split( " " );
+ if ( split.length > 0 && new File( split[0] ).isFile() )
+ {
+ Thread shutdownHook = new Thread()
+ {
+ @Override
+ public void run()
{
- @Override
- public void run()
+ try
{
- try
+ String os = System.getProperty( "os.name" ).toLowerCase(java.util.Locale.ENGLISH);
+ if ( os.contains( "win" ) )
{
- String os = System.getProperty( "os.name" ).toLowerCase(java.util.Locale.ENGLISH);
- if ( os.contains( "win" ) )
- {
- Runtime.getRuntime().exec( "cmd /c start " + restartScript );
- } else
- {
- Runtime.getRuntime().exec( "sh " + restartScript );
- }
- } catch ( Exception e )
+ Runtime.getRuntime().exec( "cmd /c start " + restartScript );
+ } else
{
- e.printStackTrace();
+ Runtime.getRuntime().exec( "sh " + restartScript );
}
+ } catch ( Exception e )
+ {
+ e.printStackTrace();
}
- };
-
- shutdownHook.setDaemon( true );
- Runtime.getRuntime().addShutdownHook( shutdownHook );
- } else
- {
- System.out.println( "Startup script '" + SpigotConfig.restartScript + "' does not exist! Stopping server." );
-
- // Actually shutdown
- try
- {
- MinecraftServer.getServer().close();
- } catch ( Throwable t )
- {
}
- }
- System.exit( 0 );
- } catch ( Exception ex )
+ };
+
+ shutdownHook.setDaemon( true );
+ Runtime.getRuntime().addShutdownHook( shutdownHook );
+ return true;
+ } else
{
- ex.printStackTrace();
+ return false;
}
}
+ // Paper end
+
}