Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

8275319 java.net.NetworkInterface throws java.lang.Error instead of SocketException #5956

Closed

Conversation

djelinski
Copy link
Member

@djelinski djelinski commented Oct 14, 2021

Per Java documentation, "Error [..] indicates serious problems that a reasonable application should not try to catch". Failure to enumerate network interfaces or addresses is not a serious enough situation; many applications can recover from this pretty easily.

All native methods (except init()) in NetworkInterface are declared with throws SocketException, so throwing SocketExceptions instead of Errors will match the declared interface.

Unix version of NetworkInterface already throws SocketExceptions in similar situations, and does not throw Error under any circumstances.

I searched the bug database and mail archives, but could not find any discussion on the topic; if there's a reason to keep throwing Errors, I couldn't find it.

tier1 tests pass, java.net tests pass. No new regression tests as I couldn't find any list of steps to reproduce the problem.

I don't have write access to JBS. I could use some help with creating a ticket.


Progress

  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue
  • Change must be properly reviewed

Issue

  • JDK-8275319: java.net.NetworkInterface throws java.lang.Error instead of SocketException

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jdk pull/5956/head:pull/5956
$ git checkout pull/5956

Update a local copy of the PR:
$ git checkout pull/5956
$ git pull https://git.openjdk.java.net/jdk pull/5956/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 5956

View PR using the GUI difftool:
$ git pr show -t 5956

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jdk/pull/5956.diff

Sorry, something went wrong.

@bridgekeeper
Copy link

bridgekeeper bot commented Oct 14, 2021

👋 Welcome back djelinski! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk
Copy link

openjdk bot commented Oct 14, 2021

@djelinski The following label will be automatically applied to this pull request:

  • net

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the net net-dev@openjdk.org label Oct 14, 2021
@djelinski djelinski marked this pull request as ready for review October 14, 2021 18:00
@jaikiran
Copy link
Member

Hello @djelinski,

I don't have write access to JBS. I could use some help with creating a ticket.

I have created https://bugs.openjdk.java.net/browse/JDK-8275319 to track this. For this PR to trigger the PR review process, you will now have to update the PR title to match the bug id and bug title (and I think even the commit message should refer to the same).

@djelinski djelinski changed the title Throw SocketExceptions instead of Errors in NetworkInterface 8275319 java.net.NetworkInterface throws java.lang.Error instead of SocketException Oct 15, 2021
@openjdk openjdk bot added the rfr Pull request is ready for review label Oct 15, 2021
@mlbridge
Copy link

mlbridge bot commented Oct 15, 2021

Webrevs

@AlanBateman
Copy link
Contributor

I suspect most of these cases need a closer look as there may be cases that can never happen, meaning Error or InternalError would be more appropriate, e.g. if MultiByteToWideChar fails with ERROR_INVALID_FLAGS or ERROR_INVALID_PARAMETER for example.

@dfuch
Copy link
Member

dfuch commented Oct 15, 2021

I have to agree with Alan on this. Is there any concrete case where an Error was observed, or is this pure speculation?
If there are concrete cases then the proper fix would probably be to examine the error code or condition that triggered the Error - and investigate if the native source code can be updated to take remediation actions in those cases.

@djelinski
Copy link
Member Author

@jaikiran thanks for the JBS. Editing PR title was sufficient.

Is there any concrete case where an Error was observed?

It happens occasionally, as evidenced by JDK-8217298, JDK-8046500 (fixed), JDK-8165665, JDK-8066931, JDK-8057900, JDK-8065559(closed), JDK-8040229, JDK-8065078(fixed), JDK-8068597(fixed), numerous reports on Google (search for "error IP Helper Library"), and an ancient note found in my company's codebase, possibly no longer relevant.

if MultiByteToWideChar fails

I started checking how such failures are handled elsewhere in JDK codebase, and the results are pretty inconsistent: ignore result, normal error handling, native assert, and Java error in NetworkInterface. Some consistency here would be nice, but then, I don't know which option I would pick. Google couldn't find any cases of "Error Cannot get multibyte", so I guess that doesn't happen too often.

Error or InternalError

There are only 2 places in Java codebase where Error is thrown, one is NetworkInterface, the other one is here. JNU_ThrowInternalError is definitely the more popular option.

Still, I'd argue that even if an unhandled error in IP helper library indicates a bug in JDK, throwing a checked exception is a better option here, because:

  • these errors do not indicate that the JVM is broken / unusable, at least not any more than it was before the function was called. They do not affect global JVM state in any way
  • a checked exception is less likely to go unnoticed (uncaught Error will usually terminate one thread, and let other threads continue; checked exception needs to be handled by user code)

@dfuch
Copy link
Member

dfuch commented Oct 15, 2021

It happens occasionally, as evidenced by JDK-8217298, JDK-8046500 (fixed), JDK-8165665, JDK-8066931, JDK-8057900, JDK-8065559(closed), JDK-8040229, JDK-8065078(fixed), JDK-8068597(fixed), numerous reports on Google (search for "error IP Helper Library"), and an ancient note found in my company's codebase, possibly no longer relevant.

So these Errors do actually happen - thanks for the confirmation, and double thanks for the archeology research. This is very useful to this discussion.

I see that some of the issue you mentioned are suggesting some fixes - like - for instance JDK-8165665.
It may be that we need to examine each of these on a case-by-case basis. It seems that GetAdaptersAddresses, for instance, has more error codes than those we specifically handle - and maybe we need more specific error handling for some of these.
Maybe throwing SocketException, as you suggest, is appropriate for some of these cases.
Maybe just skipping the interface would be the right choice for others?

If some conditions are not supposed to happen - then I still think that throwing InternalError is more appropriate.
I would be afraid that blindly changing all Error into SocketException would be hiding issues under the carpet, as they would appear as a regular failure...

@AlanBateman
Copy link
Contributor

It happens occasionally, as evidenced by JDK-8217298, JDK-8046500 (fixed), JDK-8165665, JDK-8066931, JDK-8057900, JDK-8065559(closed), JDK-8040229, JDK-8065078(fixed), JDK-8068597(fixed), numerous reports on Google (search for "error IP Helper Library"), and an ancient note found in my company's codebase, possibly no longer relevant.

There was churn in the Windows APIs and some of these issues may be historical. I think it would be useful if we can find any recent reports of issues and try to identify if they are issues in specific environments/configurations, JDK issues (not handling some cases), or something else. I'd prefer not rush in and change all these exceptions without going through them one by one.

@djelinski
Copy link
Member Author

we need to examine each of these on a case-by-case basis

Makes sense.

  1. MultiByteToWideChar case: we can change this to InternalError, as it's not supposed to happen and apparently not happening
  2. GetIfTable case:
  • as JDK-8165665 states, our handling of ERROR_INSUFFICIENT_BUFFER leaves something to be desired. We could keep allocating larger buffers a few more times, similar to what we have here.
  • ERROR_INVALID_PARAMETER qualifies as InternalError to me,
  • ERROR_NOT_SUPPORTED is either a SocketException or an empty list.
  • I'm a bit concerned about the "Other" catch-all qualifier on MSDN page, but I'd use SocketException for these, possibly with error code / name embedded in exception message. Since they are not explicitly enumerated by Microsoft, I don't see how they could qualify as Java bugs.
  1. GetIpAddrTable case: documentation specifies exactly the same list of errors as above, so we could handle them in the same way
  2. GetAdaptersAddresses case (winxp only - is the file really used only on WinXP?):
  • ERROR_ADDRESS_NOT_ASSOCIATED: either return no data or throw SocketException to make it different from ERROR_NO_DATA
  • ERROR_BUFFER_OVERFLOW: our handling looks fine here
  • ERROR_INVALID_PARAMETER: InternalError
  • ERROR_NOT_ENOUGH_MEMORY: OOME?
  • ERROR_NO_DATA: return no data
  • other: JDK-8217298 reports ERROR_ACCESS_DENIED (5), JDK-8231811 reports ERROR_NOT_FOUND (1168); I'd use SocketException for error codes in this category.

What do you think?

@AlanBateman
Copy link
Contributor

JDK-8217298 hints that one scenario where an exception is thrown is when there are no network adapters configured. I don't know how easy it would be to create that scenario and to see how current JDK releases behave.

MultiByteToWideChar - yes, makes sense.

GetIfTable -yes, I agree this needs to cleaned up. I don't see ERROR_BUFFER_OVERFLOW listed as a possible error.

GetAdaptersAddresses. It would be nice if this NetworkInterface_winXP.c could be removed but it is still used for the -Djava.net.preferIPv4Stack=true case. A more useful exercise may be to see if we could move to one implementation that works with preferIPv4Stack set to true or false.

@djelinski
Copy link
Member Author

djelinski commented Oct 18, 2021

Updated the PR with more specific exception types. I'm still doing some printf-based testing / debugging. Observations so far:

  • My assumption that WinXP file contained some ancient code was incorrect. The WinXP functions are used when NOT in java.net.preferIPv4Stack mode, and are the preferred / more modern way. Cost me a few sanity points when I first realized.
  • GetAdaptersAddresses failures are generating correct exception types, but error message generated by JNU_ThrowByNameWithMessageAndLastError is not what I expected. Example:
java.net.SocketException: An address has not yet been associated with the network endpoint (IP Helper Library GetAdaptersAddresses function failed)

I expected the Windows error code to appear after my message, not before.

  • GetIpAddrTable failures are working correctly in java.net.preferIPv4Stack mode; in default mode they crash the JVM (access violation in net.dll). Apparently some of the callers attempt to access result pointer even when lookupIPAddrTable fails.

I'll update when I know more.

FWIW, my test code:

		try {
			Collections.list(NetworkInterface.getNetworkInterfaces());
			System.out.println("getIfs succeeded");
		} catch (Throwable t) {
			t.printStackTrace();
		}

@djelinski
Copy link
Member Author

The crash reason was a bit more mundane. We assigned the (int) result of lookupIPAddrTable to a (DWORD / unsigned int) variable, then checked if the variable value is negative (which it cannot be).
Wonder why the compiler didn't warn about this.

@djelinski
Copy link
Member Author

GetIfTable failures are handled somewhat correctly; in IPv4 mode the exception is propagated as expected.

In default mode the exception is swallowed here, and then SocketException ("No network interfaces configured") is thrown in Java code.
The code that swallows exception appears incorrect; when exception is swallowed, netifPP is null, last is null and the code that could add an IPv6 interface would crash.

Should I clean that up here or leave it for future PRs?

case ERROR_NO_DATA:
// not an error
*adapters = NULL;
return ERROR_SUCCESS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mapping ERROR_NO_DATA to ERROR_SUCCESS is probably correct here. Could this explain bug reports that seem to be from configurations with no IP addresses plumbed?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it could. But in order to get ERROR_NO_DATA here one would need to disable loopback interface. I couldn't find how to do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There has been one or two reports from configurations that appear to have no interface or no IP addresses. I don't think we've been able to create those configurations to duplicate. In any case, treat NO_DATA as SUCCESS seems correct here.

@@ -237,7 +229,7 @@ static int ipinflen = 2048;
*/
int getAllInterfacesAndAddresses (JNIEnv *env, netif **netifPP)
{
DWORD ret, flags;
int ret, flags;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks right but I can't relate this to your comment about the crash.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me try to explain this clang-style:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, I hadn't spotted the broken code that tests if loopupIPAddrTable fails.

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you've got this to a good place. It would be helpful if you could trim down some of the really long lines to keep to consistent with the existing code if possible. It's clear that there is a lot of cleanup needed in this area but that can be done in future PRs as needed.

@dfuch
Copy link
Member

dfuch commented Oct 19, 2021

Hi Daniel, thanks for persisting with this. I had a look at your changes and I believe you have hit the right balance. I also like the fact plain java.lang.Error has been transformed into java.lang.InternalError. Also thanks for sanitizing the error code checks to match what GetIfTable and GetAdaptersAddresses are documented to return. I believe we are ending in a much better place. Please give me some time to import your patch and give it a round of testing.

@dfuch
Copy link
Member

dfuch commented Oct 20, 2021

Hi - I have run your current patch through our CI and the good news is that we got no failure.
However - we should probably go the extra mile and fix the case where -2 is returned.

In default mode the exception is swallowed here, and then SocketException ("No network interfaces configured") is thrown in Java code.
The code that swallows exception appears incorrect; when exception is swallowed, netifPP is null, last is null and the code that could add an IPv6 interface would crash.

I believe you are correct and there are still some issues in that case. In particular I also see -that if netaddrCount == -2 there:


we're going to try and allocate arrays of size -2 a few lines below. That doesn't sound right.

@dfuch
Copy link
Member

dfuch commented Oct 20, 2021

Hmm... JDK-8225239 might have introduced a bug.
I see that with the current code neither enumAddresses_win nor enumAddresses_win_ipaddrtable will return -2 - so there appears to be some dead code. I suspect that this line here is wrong:


it should be returning ret not NULL

Copy link
Contributor

@AlanBateman AlanBateman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current patch looks good and it's good to have caught the DWORD->int issue. We should add information about this to the JBS issue.
I don't mind if you want to iterate further with Daniel's suggestion, okay if it's a separate PR too. In general I think there is quite a bit of cleanup to do in this code.

@openjdk
Copy link

openjdk bot commented Oct 20, 2021

⚠️ @djelinski the full name on your profile does not match the author name in this pull requests' HEAD commit. If this pull request gets integrated then the author name from this pull requests' HEAD commit will be used for the resulting commit. If you wish to push a new commit with a different author name, then please run the following commands in a local repository of your personal fork:

$ git checkout networkinterface-socketexception
$ git commit -c user.name='Preferred Full Name' --allow-empty -m 'Update full name'
$ git push

@openjdk
Copy link

openjdk bot commented Oct 20, 2021

@djelinski This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8275319: java.net.NetworkInterface throws java.lang.Error instead of SocketException

Reviewed-by: alanb, dfuchs

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 108 new commits pushed to the master branch:

  • a91a0a5: 8233724: Remove -Wc++14-compat warning suppression in operator_new.cpp
  • 1271fbf: 8248584: Enable CHECK_UNHANDLED_OOPS for Windows fastdebug builds
  • 135cf3c: 8275439: Remove PrintVtableStats
  • 50a5723: 8274910: Compile in G1 evacuation failure injection code based on define
  • 5454a76: 8275273: Add missing HtmlStyle documentation
  • bd0bed7: 8273317: crash in cmovP_cmpP_zero_zeroNode::bottom_type()
  • 77b2789: 7124287: [macosx] JTableHeader doesn't get focus after pressing F8 key
  • c24fb85: 8275512: Upgrade required version of jtreg to 6.1
  • 926966b: 8275003: Suppress warnings on non-serializable non-transient instance fields in windows mscapi
  • e63c148: 8264849: Add KW and KWP support to PKCS11 provider
  • ... and 98 more: https://git.openjdk.java.net/jdk/compare/8657f77608f37d7ff5254032858f2f16c7c204d5...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@AlanBateman, @dfuch) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Oct 20, 2021
@djelinski
Copy link
Member Author

Thanks @dfuch and @AlanBateman for the review and all your help.

@dfuch I'd rather follow up on the remaining issues in a separate PR. This one is already doing more than the PR title advertises.

@djelinski
Copy link
Member Author

/integrate

@openjdk openjdk bot added the sponsor Pull request is ready to be sponsored label Oct 20, 2021
@openjdk
Copy link

openjdk bot commented Oct 20, 2021

@djelinski
Your change (at version 9e9ea0a) is now ready to be sponsored by a Committer.

@dfuch
Copy link
Member

dfuch commented Oct 20, 2021

@djelinski I have no objection in handling that in a followup PR.
I have logged https://bugs.openjdk.java.net/browse/JDK-8275640.
Would you be willing to continue with that?

@dfuch
Copy link
Member

dfuch commented Oct 20, 2021

/sponsor

@openjdk
Copy link

openjdk bot commented Oct 20, 2021

Going to push as commit 043cde2.
Since your change was applied there have been 108 commits pushed to the master branch:

  • a91a0a5: 8233724: Remove -Wc++14-compat warning suppression in operator_new.cpp
  • 1271fbf: 8248584: Enable CHECK_UNHANDLED_OOPS for Windows fastdebug builds
  • 135cf3c: 8275439: Remove PrintVtableStats
  • 50a5723: 8274910: Compile in G1 evacuation failure injection code based on define
  • 5454a76: 8275273: Add missing HtmlStyle documentation
  • bd0bed7: 8273317: crash in cmovP_cmpP_zero_zeroNode::bottom_type()
  • 77b2789: 7124287: [macosx] JTableHeader doesn't get focus after pressing F8 key
  • c24fb85: 8275512: Upgrade required version of jtreg to 6.1
  • 926966b: 8275003: Suppress warnings on non-serializable non-transient instance fields in windows mscapi
  • e63c148: 8264849: Add KW and KWP support to PKCS11 provider
  • ... and 98 more: https://git.openjdk.java.net/jdk/compare/8657f77608f37d7ff5254032858f2f16c7c204d5...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot closed this Oct 20, 2021
@openjdk openjdk bot added integrated Pull request has been integrated and removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Oct 20, 2021
@openjdk
Copy link

openjdk bot commented Oct 20, 2021

@dfuch @djelinski Pushed as commit 043cde2.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@djelinski
Copy link
Member Author

@dfuch yessir. I'll continue the cleanup, should have something for review later this week.

@djelinski djelinski deleted the networkinterface-socketexception branch January 12, 2022 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
integrated Pull request has been integrated net net-dev@openjdk.org
Development

Successfully merging this pull request may close these issues.

None yet

4 participants