37ef542a1b
5 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
6fd854ed9f
|
Replace ExternalStream with new ByteStream type (#12774)
# Description

This PR introduces a `ByteStream` type which is a `Read`-able stream of bytes. Internally, it has an enum over three different byte stream sources:

```rust
pub enum ByteStreamSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
    Child(ChildProcess),
}
```

This is in comparison to the current `RawStream` type, which is an `Iterator<Item = Vec<u8>>` and has to allocate for each read chunk.

Currently, `PipelineData::ExternalStream` serves a weird dual role where it is either external command output or a wrapper around `RawStream`. `ByteStream` makes this distinction more clear (via `ByteStreamSource`) and replaces `PipelineData::ExternalStream` in this PR:

```rust
pub enum PipelineData {
    Empty,
    Value(Value, Option<PipelineMetadata>),
    ListStream(ListStream, Option<PipelineMetadata>),
    ByteStream(ByteStream, Option<PipelineMetadata>),
}
```

The PR is relatively large, but a decent amount of it is just repetitive changes.

This PR fixes #7017, fixes #10763, and fixes #12369.

This PR also improves performance when piping external commands. Nushell should, in most cases, have competitive pipeline throughput compared to, e.g., bash.

| Command | Before (MB/s) | After (MB/s) | Bash (MB/s) |
| -------------------------------------------------- | -------------:| ------------:| -----------:|
| `throughput \| rg 'x'` | 3059 | 3744 | 3739 |
| `throughput \| nu --testbin relay o> /dev/null` | 3508 | 8087 | 8136 |

# User-Facing Changes

- This is a breaking change for the plugin communication protocol, because the `ExternalStreamInfo` was replaced with `ByteStreamInfo`. Plugins now only have to deal with a single input stream, as opposed to the previous three streams: stdout, stderr, and exit code.
- The output of `describe` has been changed for external/byte streams.
- Temporary breaking change: `bytes starts-with` no longer works with byte streams. This is to keep the PR smaller, and `bytes ends-with` already does not work on byte streams.
- If a process core dumped, then instead of having a `Value::Error` in the `exit_code` column of the output returned from `complete`, it now is a `Value::Int` with the negation of the signal number.

# After Submitting

- Update docs and book as necessary
- Release notes (e.g., plugin protocol changes)
- Adapt/convert commands to work with byte streams (high priority is `str length`, `bytes starts-with`, and maybe `bytes ends-with`).
- Refactor the `tee` code, Devyn has already done some work on this.

---------

Co-authored-by: Devyn Cairns <devyn.cairns@gmail.com> |
||
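The contrast drawn in the commit above, a `Read`-able source versus an `Iterator<Item = Vec<u8>>` that allocates per chunk, can be sketched roughly as follows. This is a minimal illustration and not the PR's implementation: `SketchSource` and `count_bytes` are hypothetical names, and the `Child` variant is left out so the sketch depends only on the standard library.

```rust
use std::fs::File;
use std::io::{self, Read};

// Hypothetical, simplified stand-in for the PR's `ByteStreamSource`.
// The `Child` variant is omitted to keep the sketch self-contained.
pub enum SketchSource {
    Read(Box<dyn Read + Send + 'static>),
    File(File),
}

impl Read for SketchSource {
    // Delegate to whichever underlying reader this source wraps.
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        match self {
            SketchSource::Read(reader) => reader.read(buf),
            SketchSource::File(file) => file.read(buf),
        }
    }
}

/// Drains a reader through one reusable stack buffer and returns the
/// number of bytes seen. No `Vec<u8>` is allocated per chunk.
pub fn count_bytes(mut src: impl Read) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total = 0u64;
    loop {
        match src.read(&mut buf) {
            Ok(0) => return Ok(total), // end of stream
            Ok(n) => total += n as u64,
            // Retry reads interrupted by a signal.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
}
```

A caller could wrap any in-memory or file-backed reader, for example `SketchSource::Read(Box::new(std::io::Cursor::new(bytes)))`, and pass it to `count_bytes`; the consumer decides how large the buffer is rather than the producer.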
|
c747ec75c9
|
Add command_prelude module (#12291)
# Description

When implementing a `Command`, one must also import all the types present in the function signatures for `Command`. This makes it so that we often import the same set of types in each command implementation file. E.g., something like this:

```rust
use nu_protocol::ast::Call;
use nu_protocol::engine::{Command, EngineState, Stack};
use nu_protocol::{
    record, Category, Example, IntoInterruptiblePipelineData, IntoPipelineData, PipelineData,
    ShellError, Signature, Span, Type, Value,
};
```

This PR adds the `nu_engine::command_prelude` module which contains the necessary and commonly used types to implement a `Command`:

```rust
// command_prelude.rs
pub use crate::CallExt;
pub use nu_protocol::{
    ast::{Call, CellPath},
    engine::{Command, EngineState, Stack},
    record, Category, Example, IntoInterruptiblePipelineData, IntoPipelineData, IntoSpanned,
    PipelineData, Record, ShellError, Signature, Span, Spanned, SyntaxShape, Type, Value,
};
```

This should reduce the boilerplate needed to implement a command and also gives us a place to track the breadth of the `Command` API. I tried to be conservative with what went into the prelude modules, since it might be hard/annoying to remove items from the prelude in the future. Let me know if something should be included or excluded. |
||
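The prelude pattern described in the commit above can be shown in miniature. The sketch below is self-contained and deliberately does not use Nushell's crates: the `protocol` module, the simplified `Command` trait, and the `Double` command are all hypothetical stand-ins, chosen only to show how a single glob import replaces a list of per-type imports.

```rust
// Hypothetical miniature of the prelude pattern; none of these are Nushell's types.
mod protocol {
    #[derive(Debug, Clone, PartialEq)]
    pub enum Value {
        Int(i64),
        String(String),
    }

    #[derive(Debug, PartialEq)]
    pub struct ShellError(pub String);

    pub struct Signature {
        pub name: &'static str,
    }

    // Simplified trait: the real `Command` trait has a different signature.
    pub trait Command {
        fn signature(&self) -> Signature;
        fn run(&self, input: Value) -> Result<Value, ShellError>;
    }
}

// The prelude re-exports everything a command implementation usually needs.
mod command_prelude {
    pub use super::protocol::{Command, ShellError, Signature, Value};
}

// A command file now needs one import line instead of several.
mod double {
    use super::command_prelude::*;

    pub struct Double;

    impl Command for Double {
        fn signature(&self) -> Signature {
            Signature { name: "double" }
        }

        fn run(&self, input: Value) -> Result<Value, ShellError> {
            match input {
                Value::Int(n) => Ok(Value::Int(n * 2)),
                other => Err(ShellError(format!("expected int, got {other:?}"))),
            }
        }
    }
}
```

The trade-off the commit message mentions is visible here: anything listed in `command_prelude` becomes part of what every command file implicitly depends on, so removing an item later would touch every file that relies on the glob import.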
|
f879c00f9d
|
The ability to specify a schema when using dfr open and dfr into-df (#11634)
# Description

There are times when explicitly specifying a schema for a dataframe is needed, such as:

- Opening CSV and JSON lines files and needing to provide more information to polars to keep it from failing, or wanting to override the default type conversion
- Converting a nushell value to a dataframe and wanting to override the default conversion behaviors

This pull request provides:

- A flag to allow specifying a schema when using `dfr into-df`
- A flag to allow specifying a schema when using `dfr open`, which works for CSV and JSON types
- A new command `dfr schema`, which displays schema information and will allow displaying the supported schema dtypes

A schema is specified by creating a record that maps each key to its dtype. Example usages:

```
{a:1, b:{a:2}} | dfr into-df -s {a: u8, b: {a: i32}} | dfr schema
{a: 1, b: {a: [1 2 3]}, c: [a b c]} | dfr into-df -s {a: u8, b: {a: list<u64>}, c: list<str>} | dfr schema
dfr open -s {pid: i32, ppid: i32, name: str, status: str, cpu: f64, mem: i64, virtual: i64} /tmp/ps.jsonl | dfr schema
```

Supported dtypes:

- null
- bool
- u8
- u16
- u32
- u64
- i8
- i16
- i32
- i64
- f32
- f64
- str
- binary
- date
- datetime[time_unit: (ms, us, ns) timezone (optional)]
- duration[time_unit: (ms, us, ns)]
- time
- object
- unknown
- list[dtype]

Structs are also supported but are specified via another record: `{a: u8, b: {d: str}}`

Another feature of the `dfr schema` command is that it returns the data in a format that can be passed back in as a valid schema argument:

<img width="638" alt="Screenshot 2024-01-29 at 10 23 58" src="https://github.com/nushell/nushell/assets/56345/b49c3bff-5cda-4c86-975a-dfd91d991373">

---------

Co-authored-by: Jack Wright <jack.wright@disqo.com> |
||
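To make the schema notation in the commit above concrete, here is a rough sketch of how dtype strings such as `u8` or `list<u64>` could be turned into a typed representation. It is an assumption-laden illustration, not the code in the PR: `DType` and `parse_dtype` are hypothetical names, only a handful of the listed dtypes are covered, and the `list<...>` spelling follows the example usages rather than the `list[dtype]` entry in the dtype list.

```rust
// Hypothetical dtype model; not the polars or nu-cmd-dataframe types.
// Only a small subset of the dtypes listed in the commit is modelled.
#[derive(Debug, Clone, PartialEq)]
pub enum DType {
    U8,
    U64,
    I32,
    I64,
    F64,
    Str,
    List(Box<DType>),
}

/// Parses a dtype string such as "i32" or "list<u64>".
/// Returns `None` for anything this sketch does not recognise.
pub fn parse_dtype(s: &str) -> Option<DType> {
    let s = s.trim();

    // Nested form: peel one `list<...>` layer and recurse on the inner dtype.
    if let Some(inner) = s.strip_prefix("list<").and_then(|rest| rest.strip_suffix('>')) {
        return parse_dtype(inner).map(|d| DType::List(Box::new(d)));
    }

    match s {
        "u8" => Some(DType::U8),
        "u64" => Some(DType::U64),
        "i32" => Some(DType::I32),
        "i64" => Some(DType::I64),
        "f64" => Some(DType::F64),
        "str" => Some(DType::Str),
        _ => None,
    }
}
```

Under these assumptions, `parse_dtype("list<u64>")` yields `Some(DType::List(Box::new(DType::U64)))`, and an unrecognised name yields `None`, which a caller could report as a schema error instead of falling back to a default conversion.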
|
9068093081
|
Improve type hovers (#9515)
# Description

This PR does a few things to help improve type hovers and, in the process, fixes a few outstanding issues in the type system. Here's a list of the changes:

* `for` now will try to infer the type of the iteration variable based on the expression it's given. This fixes things like `for x in [1, 2, 3] { }` where `x` now properly gets the int type.
* Removed old input/output type fields from the signature, focuses on the vec of signatures. Updated a bunch of dataframe commands that hadn't moved over. This helps tie things together a bit better
* Fixed inference of types from subexpressions to use the last expression in the block
* Fixed handling of explicit types in `let` and `mut` calls, so we now respect that as the authoritative type

I also tried to add `def` input/output type inference, but unfortunately we only know the predecl types universally, which means we won't have enough information to properly know what the types of the custom commands are.

# User-Facing Changes

Script typechecking will get tighter in some cases

Hovers should be more accurate in some cases that previously resorted to any.

---------

Co-authored-by: Darren Schroeder <343840+fdncred@users.noreply.github.com> |
||
|
c55b5c0a55
|
move dataframe commands to nu-cmd-dataframe (#9241)
All of the dataframe commands ported over with no issues...

### 11 tests are commented out (for now)

So 100 of the original 111 tests are passing, with only 11 tests being ignored for now.

As per our conversation in the core team meeting on Wednesday, I took @jntrnr's suggestion and just commented out the tests dealing with [IntoDatetime](https://github.com/nushell/nushell/blob/main/crates/nu-command/src/conversions/into/mod.rs)

Later on we can move this functionality out of nu-command if we decide it makes sense...

### The following tests were ignored...

```rust
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_day.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_hour.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_minute.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_month.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_nanosecond.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_ordinal.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_second.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_week.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_weekday.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/date/get_year.rs
modified: crates/nu-cmd-dataframe/src/dataframe/series/string/strftime.rs
```
|