Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor loading and archiving or scripts post #1059 #1089

Closed
mstoykov opened this issue Jul 24, 2019 · 6 comments
Closed

Refactor loading and archiving or scripts post #1059 #1089

mstoykov opened this issue Jul 24, 2019 · 6 comments
Labels
enhancement evaluation needed proposal needs to be validated or tested before fully implementing it in k6 help wanted refactor ux

Comments

@mstoykov
Copy link
Collaborator

While #1059 will do a lot of good for loading of scripts and their storage in archives:

  1. We will correctly load remote scripts from archives instead of making http requests
  2. You can now use full URLs with scheme like file://C:/path/to/somewhere.js or https://example.com/test.js
  3. Fixing a lot of issues - see the PR

During the development of #1059 multiple times I found out that I need to make more changes or add more problems to be solved in the future. The current PR is more or less the minimal amount of code that fixes problems without introducing new ones (mostly :(). Along the way two things became apparent:

  1. The whole loader package should be rewritten in a way that is more encapsulated. @na-- gave a good explanation at what that will probably look like in a comment
  2. Putting the (relevant parts of a ) user filesystem into an archive and having it run on another system is hard and there are a lot of problems ... especially with windows.

For this I add this new issue where all current problems will be listed and a solution to them will be drafted, possibly with some problems being marked as non issues:

  1. Currently and especially after Rewrite script/files loading to be be url based #1059 you will be able to have multiple names for the same file and Rewrite script/files loading to be be url based #1059 for various reasons doesn't fix all of them. This means that if you import a utils.js file with for example example.com/utils.js and https://example.com/utils.js it will be loaded and ran twice. The problems is even bigger if that package in hosted on github/cdnjs and someone uses the shortcut urls as that will be loaded again ...
  2. Case insensitivity of (parts of) URLs and some filesystems: By rfc3986 both the scheme and host part of a URI are case insensitive. While that was reported only the scheme is now lowered cased by default in golang. Which means that currently https://example.com/utils.js and https://eXample.com/utils.js are different files for k6 and will be loaded and executed separately.
    Additionally windows is in general case insenstive, although you can make it case sensitive and I was told that macos's can be case insensitive, and I won't be surprised if you can make the same for one of the many filesystems that linux uses. So ... we will have to even check whether the filesystem is or is not case insensitive for the particular path we are using it and than ... do something ?!?
  3. Because of the way golang's filepath works C:\\something is not recognised as absolute path on linux ... obviously which means that we need to specifically copy the code from filepath in order to when we execute an archive which imports C:\\something to know that is absolute file path and not a url with scheme C and (possibly) opaque \\something. This has been done in numerous places as filepath also doesn't transform \ to / on linux ... as ... it is OS dependant. (if the user uses file://C:/something that will work as there is no ambiguity)
  4. Currently if you import cdnjs.com/library/Faker this will make a url request findout the current versions of Faker and then get the url for last one and download that. Currently we write this as https://cdnjs.com/library/Faker in the archive so that we use that specific version when it is ran (also because that was somewhat how was done previously, so for backwards compatibility). This is similar to how github.com/something/else works but without making the intermediate request. This means that if someone uses the exact url this will download and run the script again.
  5. Something that I forget

Given the fact that on the k6 site it says

k6 is a developer centric open source load and performance regression testing tool for testing the performance of your cloud native backend infrastructure: APIs, microservices, serverless, containers and websites. It’s built to integrate well into your development workflow and CI/CD automation pipelines.

I would expect that the scripts that are written to be executed for k6 should work on both windows and linux and macos and whatever else we manage to compile k6 for. Given this having absolute paths and paths with backward slashes in scripts seems to be ... bad idea. While we might manage to fix all of those problems it will be strange to have backward slashes on linux ... Having forward ones on windows is at least ... supported, even if not the standard (for some unknown to me reason).

Given all of that my main proposal for fixing the above problems is to basically not fix 2 and 3 (apart from maybe lower casing the host always ). We should probably start giving warnings to people who use absolute paths and backward slashes, with I propose moving that functionality behind a flag(s) in a year and completely dropping it. And the warning should say that the an archive produced from a script using backward slashes and absolute paths may not work in the future and may not work in the cloud offering even now.
For 1. I would expect that this will be "fixed" on the next rewrite where we completely remove the loader package and write something along the lines of the ExecutionEnvironment that @na-- proposed.
Just for the record I haven't found many places where node.js advertises they support absolute paths and practically no mentions of it. In their esm proposal it is specifically mentioned that almost everything is URL and if you want absolute paths you need to use file://. We can still support moduleSpecifiers which are absolute in this case ... and we can probably even guarantee that they will work from an archive although I still think it is a bad idea to use them and should still spill a warning :)

Alternatively we can copy the windows filepath code and start calling it even on windows for some scripts ... maybe if we run it from an archive that was made on windows. But then there is difference between running from an archive and having the code on the filesystem ... We can always run all sanitization in an effort to have the same code that work on windows (without absolute paths obviously) to work on linux/macos but if someone made their files with uppercase first late Utils.js but reference them with utils.js that won't work ... I am certain there are more places where this will break ...

Another possibility is to ... stop trying to bridge this gap, leave things as are and add a resolveMap(name is up for discussion :)) in order to be able to get the same resolved URL when running from an archive as was resolved when the archive was created.
k6 will record for each moduleSpecifier what it was resolved to (with the pwd it was resolved from) and than write this to the metadata.json. When we run from the archive k6 will use this map to the same resolution. This way archives will be portable, but code will still break if you move them between machines/OSes. The good things about this is that we can write only the REAL final URL (with some sanitizaiton) to the archive and have the map fix the problems with cdnjs/github.com/schemeless urls(/case insensitive filesystems? if we decide that windows is always case insensitive and just lower case everything?)
In this case we will probably have os dependant implementations of the resolving code and possibly even drop support for paths with backward slashes from linux in the future.

@mstoykov mstoykov added enhancement evaluation needed proposal needs to be validated or tested before fully implementing it in k6 help wanted question refactor ux labels Jul 24, 2019
@na-- na-- removed the question label Jul 24, 2019
@na--
Copy link
Member

na-- commented Jul 24, 2019

I'm against a resolveMap, since that seems very fragile and frankly, both unnecessary and unnecessarily complex. I don't think fixing any of the issues you've listed is urgent at all, they all seem fairly marginal and low-priority, but I think all of them are fixable with the right abstractions. Hopefully I'm not pinning too high hopes on the better future abstractions, but one of my chief concerns with #1059 is its complexity and leakage of low-level implementation details everywhere, so hopefully we can both fix the bugs and improve those in the future...

In my head, if we stop relying on the OS-dependent filepath module, then we can write a very straightforward, pure and unambiguous function resolveAbsoluteForwardSlashedPath(osType, currentAbsolutePath, relativeOrAbsolutePathToResolve) resolvedPath. The biggest obstacle to this is that we'd need to copy the windows filepath handling code, but it would allow us to execute all Windows-generated archives without any issues on a Linux or Mac machine. Even ignoring the absolute paths on Windows, something like that could also be used to deal with the Windows case-insensitive filenames, which to me is probably the bigger issue of the two.

That said, I could also be persuaded that using absolute Windows file paths (e.g. C:\whatever\myscript.js) is sort of stupid and we should discourage and/or deprecate it. It doesn't make a lot of sense to import files that way, even if there's only one person touching the code. My current thinking is that if we try to properly handle the differences between the operating systems instead of {ignoring them and/or working around them and/or relying on an OS-dependent piece of code}, then it won't be a huge hassle to support absolute or case-insensitive paths. Though I may very well be wrong on these points...

@na--
Copy link
Member

na-- commented Aug 16, 2019

Something else that would be nice is to remove the archive code from the main /lib folder and move it to a submodule. Having it in the main module is making it difficult to test things, because those often depend on things outside of lib, which in tern import the basic interfaces in lib, causing import loops...

@mstoykov
Copy link
Collaborator Author

I would like to recommend on looking into

We should probably start giving warnings to people who use absolute paths and backward slashes, with I propose moving that functionality behind a flag(s) in a year and completely dropping it. And the warning should say that the an archive produced from a script using backward slashes and absolute paths may not work in the future and may not work in the cloud offering even now.

At least on showing a warning that using absolute paths is not a good idea, and if not used with file://, archives might not be runnable in the k6 cloud or when an archived and moved between machines.

@na--
Copy link
Member

na-- commented Jun 25, 2021

Yeah, showing a warning is probably a good idea 👍 Though if we can show a warning for Windows absolute paths like C:/path to use file://C/path instead, can't we just rewrite them on the fly? 😅

@mstoykov
Copy link
Collaborator Author

No, because absolute path are platform dependant

@mstoykov
Copy link
Collaborator Author

I am closing this issue - most of the issues here have not been hit by most users.

I have never seen a user using two different cases for hte same import or heard of such a case.
The magic urls are going away.
So the only problem is that absolute paths on windows aren't particularly cross-platform - which should have it's own issue. So I am openning #3867

@mstoykov mstoykov closed this as not planned Won't fix, can't repro, duplicate, stale Jul 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement evaluation needed proposal needs to be validated or tested before fully implementing it in k6 help wanted refactor ux
2 participants