Skip to content

Implemented _getFileSystemRepresentation for Windows #2366

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jan 24, 2020

Conversation

gmittert
Copy link
Contributor

Modified the helper _getFileSystemRepresentation functions to return a
UTF16 Windows style path.

This makes URL.path return a RFC 1808 path on Windows as opposed to the C:\... path before. Since .path would no longer be valid to pass to win32 functions, it adds Windows specific file system standardization for standardizingPath and getFileSystemRepresentation. Paths should now be represented internally as RFC 1808 paths on Windows, and only converted when being passed to win32 functions.

@gmittert
Copy link
Contributor Author

See also: #2367

cc @compnerd @lxbndr @millenomi @spevans

@gmittert gmittert force-pushed the WinFSR branch 2 times, most recently from abcba72 to 98f1251 Compare June 21, 2019 17:43
@gmittert
Copy link
Contributor Author

  • Now uses NativeFSRCharType
  • UNC paths should now properly be converted
  • Made sure things are internal instead of public/open

@gmittert
Copy link
Contributor Author

@swift-ci please test linux

@millenomi
Copy link
Contributor

I don't want to bifurcate the API. Also: a buffer of WCHARS can be expressed as a buffer of UInt8s with a certain length requirement, so I don't see why bifurcating types is necessary at all.

@gmittert
Copy link
Contributor Author

gmittert commented Jul 9, 2019

I don't want to bifurcate the API. Also: a buffer of WCHARS can be expressed as a buffer of UInt8s with a certain length requirement, so I don't see why bifurcating types is necessary at all.

I'm pretty uncomfortable with storing UTF16 encoded strings in a UInt8 buffer. I think it's asking to get a lot of invisible unicode errors on Windows. At the minimum, calling _fileSystemRepresentation then having to rebind the UnsafePointer<UInt8> you're passed to a UnsafePointer<WCHAR> is confusing would probably involve calling strlen on the buffer each time.

This also doesn't change the public api, only internal. The public getFileSystemRepresentation(:maxLength) still returns a UTF8 encoded cstring, even on Windows.

@gmittert
Copy link
Contributor Author

@millenomi What would you want to see api wise for Windows File System Representations? Shall I just implement the current UInt8 one to return UTF16 data on Windows?

@millenomi
Copy link
Contributor

millenomi commented Jul 22, 2019

Correct. I understand the wish for strong typing, but this isn't source-compatible across platforms, and as such it is in direct opposition to a project goal.

@gmittert gmittert force-pushed the WinFSR branch 2 times, most recently from adea383 to ddc698e Compare July 25, 2019 19:09
@gmittert
Copy link
Contributor Author

gmittert commented Jul 25, 2019

Alright! Finally cleaned this up a bit. It shouldn't touch any public apis, instead modifying the signatures of the internal _fileSystemRepresentation functions. The public getFileSystemRepresentation(:maxLength) still returns a UTF8 encoded cstring, even on Windows but should now actually be a usable path.

@millenomi
Copy link
Contributor

@swift-ci please test linux

@gmittert
Copy link
Contributor Author

@swift-ci please test linux

// directory on the `C:` drive.
while fsr.count > 1
&& (fsr[fsr.index(before: fsr.endIndex)] == "\\")
&& !(fsr.count == 3 && fsr[fsr.index(fsr.endIndex, offsetBy: -2)] == ":") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish there was a more elegant way to write this.

@compnerd
Copy link
Member

LGTM if @millenomi is okay with the restructuring for the getFSR

@gmittert
Copy link
Contributor Author

gmittert commented Aug 14, 2019

To confirm, this PR will mean that that URL(fileURLWithPath: "C:/foo\bar").path returns C:/foo/bar on Windows. This is still considered an absolute url, though it's an extension to the spec

RFC 8089 E.2. DOS and Windows Drive Letters

On Windows- or DOS-like file systems, an absolute file path can begin
with a drive letter. To facilitate this, the "local-path" rule in
Section 2 can be replaced with the following:

 local-path     = [ drive-letter ] path-absolute
 drive-letter   = ALPHA ":"

The "ALPHA" rule is defined in [RFC5234].

This is intended to support the minimal representation of a local
file in a DOS- or Windows-like environment, with no authority field
and an absolute path that begins with a drive letter. For example:

o "file:c:/path/to/file"

URIs of the form "file:///c:/path/to/file" are already supported by
the "path-absolute" rule.

Note that comparison of drive letters in DOS or Windows file paths is
case insensitive. In some usages of file URIs, drive letters are
canonicalized by converting them to uppercase; other usages treat
URIs that differ only in the case of the drive letter as identical.

Modified the helper _getFileSystemRepresentation functions to return a
UTF16 Windows style path.
@gmittert
Copy link
Contributor Author

gmittert commented Jan 7, 2020

Accidentally closed this while keeping it up to date.

@gmittert gmittert reopened this Jan 7, 2020
@gmittert
Copy link
Contributor Author

gmittert commented Jan 7, 2020

@swift-ci please test linux

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants