c61075e20e
2 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
c61075e20e
|
Add string/binary type color to ByteStream (#12897)
# Description This PR allows byte streams to optionally be colored as being specifically binary or string data, which guarantees that they'll be converted to `Binary` or `String` appropriately on `into_value()`, making them compatible with `Type` guarantees. This makes them significantly more broadly usable for command input and output. There is still an `Unknown` type for byte streams coming from external commands, which uses the same behavior as we previously did where it's a string if it's UTF-8. A small number of commands were updated to take advantage of this, just to prove the point. I will be adding more after this merges. # User-Facing Changes - New types in `describe`: `string (stream)`, `binary (stream)` - These commands now return a stream if their input was a stream: - `into binary` - `into string` - `bytes collect` - `str join` - `first` (binary) - `last` (binary) - `take` (binary) - `skip` (binary) - Streams that are explicitly binary colored will print as a streaming hexdump - example: ```nushell 1.. | each { into binary } | bytes collect ``` # Tests + Formatting I've added some tests to cover it at a basic level, and it doesn't break anything existing, but I do think more would be nice. Some of those will come when I modify more commands to stream. # After Submitting There are a few things I'm not quite satisfied with: - **String trimming behavior.** We automatically trim newlines from streams from external commands, but I don't think we should do this with internal commands. If I call a command that happens to turn my string into a stream, I don't want the newline to suddenly disappear. I changed this to specifically do it only on `Child` and `File`, but I don't know if this is quite right, and maybe we should bring back the old flag for `trim_end_newline` - **Known binary always resulting in a hexdump.** It would be nice to have a `print --raw`, so that we can put binary data on stdout explicitly if we want to. This PR doesn't change how external commands work though - they still dump straight to stdout. Otherwise, here's the normal checklist: - [ ] release notes - [ ] docs update for plugin protocol changes (added `type` field) --------- Co-authored-by: Ian Manske <ian.manske@pm.me> |
||
|
6fd854ed9f
|
Replace ExternalStream with new ByteStream type (#12774)
# Description This PR introduces a `ByteStream` type which is a `Read`-able stream of bytes. Internally, it has an enum over three different byte stream sources: ```rust pub enum ByteStreamSource { Read(Box<dyn Read + Send + 'static>), File(File), Child(ChildProcess), } ``` This is in comparison to the current `RawStream` type, which is an `Iterator<Item = Vec<u8>>` and has to allocate for each read chunk. Currently, `PipelineData::ExternalStream` serves a weird dual role where it is either external command output or a wrapper around `RawStream`. `ByteStream` makes this distinction more clear (via `ByteStreamSource`) and replaces `PipelineData::ExternalStream` in this PR: ```rust pub enum PipelineData { Empty, Value(Value, Option<PipelineMetadata>), ListStream(ListStream, Option<PipelineMetadata>), ByteStream(ByteStream, Option<PipelineMetadata>), } ``` The PR is relatively large, but a decent amount of it is just repetitive changes. This PR fixes #7017, fixes #10763, and fixes #12369. This PR also improves performance when piping external commands. Nushell should, in most cases, have competitive pipeline throughput compared to, e.g., bash. | Command | Before (MB/s) | After (MB/s) | Bash (MB/s) | | -------------------------------------------------- | -------------:| ------------:| -----------:| | `throughput \| rg 'x'` | 3059 | 3744 | 3739 | | `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 | # User-Facing Changes - This is a breaking change for the plugin communication protocol, because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`. Plugins now only have to deal with a single input stream, as opposed to the previous three streams: stdout, stderr, and exit code. - The output of `describe` has been changed for external/byte streams. - Temporary breaking change: `bytes starts-with` no longer works with byte streams. This is to keep the PR smaller, and `bytes ends-with` already does not work on byte streams. - If a process core dumped, then instead of having a `Value::Error` in the `exit_code` column of the output returned from `complete`, it now is a `Value::Int` with the negation of the signal number. # After Submitting - Update docs and book as necessary - Release notes (e.g., plugin protocol changes) - Adapt/convert commands to work with byte streams (high priority is `str length`, `bytes starts-with`, and maybe `bytes ends-with`). - Refactor the `tee` code, Devyn has already done some work on this. --------- Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com> |