Skip to content

8274112: (fc) Tune FileChannel.transferTo() #5623

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 9 commits into from

Conversation

bplb
Copy link
Member

@bplb bplb commented Sep 22, 2021

Please consider this patch which would improve the performance of
FileChannel.transferTo() on macOS and Windows.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5623/head:pull/5623
$ git checkout pull/5623

Update a local copy of the PR:
$ git checkout pull/5623
$ git pull https://git.openjdk.java.net/jdk pull/5623/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5623

View PR using the GUI difftool:
$ git pr show -t 5623

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5623.diff

Sorry, something went wrong.

@bridgekeeper
Copy link

bridgekeeper bot commented Sep 22, 2021

👋 Welcome back bpb! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot added the rfr Pull request is ready for review label Sep 22, 2021
@bplb
Copy link
Member Author

bplb commented Sep 22, 2021

On Linux, sendfile() was measured to provide faster transfer than either the
trusted channel (mapped buffer copy) or the arbitrary channel (Java read-write)
paths, and an experimental native read-write path. Therefor on Linux no change is
needed.

On macOS, native read-write was measured to be faster than both the trusted and
arbitrary paths. Benchmark results for various lengths are for the trusted path

Benchmark                    (length)   Mode  Cnt       Score      Error  Units
TransferTune.transferToFC        1000  thrpt   10  127670.489 ± 1018.466  ops/s
TransferTune.transferToFC       10000  thrpt   10  103345.306 ± 1049.713  ops/s
TransferTune.transferToFC      100000  thrpt   10   36488.102 ±  242.382  ops/s
TransferTune.transferToFC      500000  thrpt   10    9440.104 ±   56.543  ops/s
TransferTune.transferToFC     1000000  thrpt   10    4813.599 ±   78.812  ops/s
TransferTune.transferToFC    10000000  thrpt   10     184.298 ±    4.617  ops/s
TransferTune.transferToFC   100000000  thrpt   10      23.391 ±    0.183  ops/s
TransferTune.transferToFC   500000000  thrpt   10       4.674 ±    0.028  ops/s
TransferTune.transferToFC  1000000000  thrpt   10       2.349 ±    0.010  ops/s
TransferTune.transferToFC  2000000000  thrpt   10       0.993 ±    0.004  ops/s
TransferTune.transferToFC  4000000000  thrpt   10       0.455 ±    0.002  ops/s

and the proposed native read-write path

Benchmark                    (length)   Mode  Cnt       Score      Error  Units
TransferTune.transferToFC        1000  thrpt   10  253841.238 ± 1884.480  ops/s
TransferTune.transferToFC       10000  thrpt   10  215629.254 ± 1987.146  ops/s
TransferTune.transferToFC      100000  thrpt   10   48655.924 ± 1060.983  ops/s
TransferTune.transferToFC      500000  thrpt   10   11596.874 ±  126.086  ops/s
TransferTune.transferToFC     1000000  thrpt   10    5889.314 ±   20.170  ops/s
TransferTune.transferToFC    10000000  thrpt   10     364.018 ±    2.149  ops/s
TransferTune.transferToFC   100000000  thrpt   10      31.956 ±    0.277  ops/s
TransferTune.transferToFC   500000000  thrpt   10       6.282 ±    0.090  ops/s
TransferTune.transferToFC  1000000000  thrpt   10       3.134 ±    0.103  ops/s
TransferTune.transferToFC  2000000000  thrpt   10       1.542 ±    0.080  ops/s
TransferTune.transferToFC  4000000000  thrpt   10       0.750 ±    0.087  ops/s

On Windows the native read-write path was also measured to be faster than the trusted path except that beyond a certain limit the trusted path becomes faster than native read-write. The value of this limit appears to be approximately 2MB but more verification might be needed.

@openjdk
Copy link

openjdk bot commented Sep 22, 2021

@bplb The following label will be automatically applied to this pull request:

  • nio

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the nio nio-dev@openjdk.org label Sep 22, 2021
@mlbridge
Copy link

mlbridge bot commented Sep 22, 2021

@AlanBateman
Copy link
Contributor

AlanBateman commented Sep 22, 2021

No objection to try to improve the macOS and Windows cases but I'm concerned about the maintainability. It might be that we have to do some significant refactoring of the FileChannel implementation to support these cases. I think the native methods should be as simple as possible and all the decisions on the dispatching should be in the Java code, not the native code.

@bplb
Copy link
Member Author

bplb commented Sep 22, 2021

I had considered adding a fourth path, transferToFileChannel(), which would be after the direct but before the trusted path. This would I think dispense with the UNSUPPORTED_SUBCASE.

@AlanBateman
Copy link
Contributor

I had considered adding a fourth path, transferToFileChannel(), which would be after the direct but before the trusted path. This would I think dispense with the UNSUPPORTED_SUBCASE.

transferToDirectly already tests if the target is a FileChannel so I think a transferToFileChannel would be cleaner. My main concern is that there is many cases and have some in FCI and some in the native code makes it difficult to maintain. I'd prefer to have the native methods be simple "leaf" methods and all decisions are done in the FCI code.

@bplb
Copy link
Member Author

bplb commented Sep 22, 2021

I agree about keeping the decisions at the Java level. I will work on refactoring this.

@bplb
Copy link
Member Author

bplb commented Sep 22, 2021

The second commit refactored FileChannelImpl to use transferToFileChannel() interposed between the direct and trusted paths. The performance measurements are little changed:

Benchmark                    (length)   Mode  Cnt       Score      Error  Units
TransferTune.transferToFC        1000  thrpt   10  287559.646 ± 3136.563  ops/s
TransferTune.transferToFC       10000  thrpt   10  238028.996 ± 3759.595  ops/s
TransferTune.transferToFC      100000  thrpt   10   50048.757 ± 1462.758  ops/s
TransferTune.transferToFC      500000  thrpt   10   11642.157 ±   99.617  ops/s
TransferTune.transferToFC     1000000  thrpt   10    5950.070 ±   54.328  ops/s
TransferTune.transferToFC    10000000  thrpt   10     363.774 ±    3.560  ops/s
TransferTune.transferToFC   100000000  thrpt   10      32.454 ±    0.166  ops/s
TransferTune.transferToFC   500000000  thrpt   10       6.506 ±    0.175  ops/s
TransferTune.transferToFC  1000000000  thrpt   10       3.127 ±    0.121  ops/s
TransferTune.transferToFC  2000000000  thrpt   10       1.555 ±    0.084  ops/s
TransferTune.transferToFC  4000000000  thrpt   10       0.747 ±    0.101  ops/s

the transfer_read_write() functions were not changed as yet to return a positive tw if either the read or the write fails. The UNSUPPORTED_SUBCASE was removed.

@AlanBateman
Copy link
Contributor

Thanks for the updates, this looks much cleaner and maintainable now. Busy just now so will look at it closer later.

@AlanBateman
Copy link
Contributor

Are you planning to update the PR with the changes that we discussed here?

@AlanBateman
Copy link
Contributor

The latest version looks okay but I'm wondering if we can move transfer_read_write out of the native code. As you know, in this area we try to have the JNI methods do a single syscall. With this change we've put a transfer loop into the native code when it should probably be in Java.

@bplb
Copy link
Member Author

bplb commented Nov 1, 2021

I believe that prior to publishing this PR I tested the case of doing the read-write transfer in Java and it did not afford the same sort of performance improvement as what is proposed here. This version was inspired more by the native code underlying the Unix implementation of Files.copy() (UnixCopyFile.c) which has exactly the same sort of native read-write loop in it. The question here then becomes whether the performance improvement is worth the extra native code.

@bridgekeeper
Copy link

bridgekeeper bot commented Nov 30, 2021

@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bplb
Copy link
Member Author

bplb commented Nov 30, 2021

continue ...

@bridgekeeper
Copy link

bridgekeeper bot commented Dec 28, 2021

@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bplb
Copy link
Member Author

bplb commented Jan 3, 2022

continue

@bridgekeeper
Copy link

bridgekeeper bot commented Jan 31, 2022

@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bplb
Copy link
Member Author

bplb commented Jan 31, 2022

continue ...

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 1, 2022

@bplb This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply add a new comment to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

@bridgekeeper
Copy link

bridgekeeper bot commented Mar 29, 2022

@bplb This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

@bridgekeeper bridgekeeper bot closed this Mar 29, 2022
@bplb
Copy link
Member Author

bplb commented Mar 29, 2022

/open

@openjdk openjdk bot reopened this Mar 29, 2022
@openjdk
Copy link

openjdk bot commented Mar 29, 2022

@bplb This pull request is now open

bplb added 2 commits April 7, 2022 15:24
…; use Linux copy_file_range() for direct copy if available
@bplb
Copy link
Member Author

bplb commented Apr 12, 2022

Commit 65e3adc makes the following changes:

  1. removes transferToFileChannel() and its associated logic and read-write native code;
  2. adds a transfer size threshold below which trusted transfer will fall back to arbitrary transfer;
  3. uses copy_file_range(2) instead of sendfile(2) for direct transfer on Linux when possible.

The effects of these changes are:

  • the logic is simplified and the amount of native code changed is greatly reduced;
  • non-direct transfers below the trusted transfer size threshold are faster (verified on Linux, macOS, and Windows);
  • direct transfers larger than 0x7ffff000 (2,147,479,552) bytes to a file channel may be made on Linux if copy_file_range() is available (verified on Linux).

@bplb
Copy link
Member Author

bplb commented Apr 28, 2022

As discussed in JDK-8282039, using copy_file_range(2) on Linux does not appear to offer any advantage over sendfile(2), at least for use cases in the JDK. So this PR has now been winnowed down to just adding a size threshold below which a read-write copy loop will be used instead of memory mapping to perform the transfer.

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've got this to a good place and much simpler than the early iterations. So I think the src changes are good. I don't have time right now to do a detailed review of the test, maybe @LanceAndersen might be help on that.

@openjdk
Copy link

openjdk bot commented Apr 29, 2022

@bplb This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8274112: (fc) Tune FileChannel.transferTo()

Reviewed-by: alanb, lancea, rriggs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 57 new commits pushed to the master branch:

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Apr 29, 2022
rand.nextBytes(expected);
for (int i=0; i<size; i++)
if (incoming[i] != expected[i])
throw new RuntimeException("Data corrupted");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps if (!Arrays.equals(incoming, 0, size, expected, 0, size))...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree but I would take it further and create a utility method so that it is reusable as it looks like you could take advantage of it later in the test(around line 175)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps if (!Arrays.equals(incoming, 0, size, expected, 0, size))...

I think here the simpler Arrays.equals(byte[],byte[]) can be used.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree but I would take it further and create a utility method so that it is reusable as it looks like you could take advantage of it later in the test(around line 175)

I think that if line 124 and after uses the simpler Arrays.equals(byte[],byte[]) then such a utility method would be less useful.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by commit c72b4b5.

static void generateBigFile(File file) throws Exception {
private static void transferFileDirectly() throws Exception {
outFile.delete();
FileOutputStream fos = new FileOutputStream(outFile);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can try-with-resources be used here for both fos and out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Fixed by commit c72b4b5.

if (bytesTransferred >= 0)
pos += bytesTransferred;
else
throw new Exception("transfer failed");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including some position information in the exception would help diagnose, especially with intermittent failures, if they happen.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by commit c72b4b5.

I would probably have been tempted to convert this test to leverage TestNG as I think it would streamline the test a bit (or safe this for a future date)

I prefer to leave the TestNG conversion to another issue TBD.

Copy link
Contributor

@LanceAndersen LanceAndersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor comments.

I would probably have been tempted to convert this test to leverage TestNG as I think it would streamline the test a bit (or safe this for a future date)

static int CHUNK_SIZE = 1024 * 9;
private static final Random RAND = RandomFactory.getRandom();

private static final int FILE_SIZE = 1000*1024;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use a comment regarding the reason the file size is what it is

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't recall why that particular value was chosen except in relation to other constants. Will add some comments.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by commit c72b4b5.

rand.nextBytes(expected);
for (int i=0; i<size; i++)
if (incoming[i] != expected[i])
throw new RuntimeException("Data corrupted");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree but I would take it further and create a utility method so that it is reusable as it looks like you could take advantage of it later in the test(around line 175)

Copy link
Contributor

@RogerRiggs RogerRiggs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@bplb
Copy link
Member Author

bplb commented May 4, 2022

/integrate

@openjdk
Copy link

openjdk bot commented May 4, 2022

Going to push as commit 17cc713.
Since your change was applied there have been 86 commits pushed to the master branch:

  • 7424f47: 8286114: [test] show real exception in bomb call in sun/rmi/runtime/Log/checkLogging/CheckLogging.java
  • 29c2e54: 8286092: Remove dead windows stack code
  • 4e1e76a: 8278757: [s390] Implement AES Counter Mode Intrinsic
  • 4b2c822: 8282477: [x86, aarch64] vmassert(_last_Java_pc == NULL, "already walkable"); fails with async profiler
  • ca9d039: 8285934: Remove unimplemented MemTracker::init_tracking_level
  • 0462d5a: 8285452: Add a new test library API to replace a file content using FileUtils.java
  • 4282fb2: 8286063: check compiler queue after calling AbstractCompiler::on_empty_queue
  • 075ce8a: 8286069: keytool prints out wrong key algorithm for -importpass command
  • efcd3d3: 8286088: add comment to InstallAsyncExceptionHandshake destructor
  • f82dd76: 8285885: Replay compilation fails with assert(is_valid()) failed: check invoke
  • ... and 76 more: https://git.openjdk.java.net/jdk/compare/1f868f1d091602cc462ee0fe5fa613a3638a5f1c...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label May 4, 2022
@openjdk openjdk bot closed this May 4, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels May 4, 2022
@openjdk
Copy link

openjdk bot commented May 4, 2022

@bplb Pushed as commit 17cc713.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@bplb bplb deleted the FileChannel-transferTo-tune-8274112 branch May 4, 2022 15:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated nio nio-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

None yet

4 participants