From aa7d7d0cc3bffcf4323291a7a4163e41dfbee6dd Mon Sep 17 00:00:00 2001
From: Devyn Cairns <devyn.cairns@gmail.com>
Date: Wed, 17 Jul 2024 14:02:42 -0700
Subject: [PATCH 01/14] Overhaul `$in` expressions (#13357)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

# Description

This grew quite a bit beyond its original scope, but I've tried to make
`$in` a bit more consistent and easier to work with.

Instead of the parser generating calls to `collect` and creating
closures, this adds `Expr::Collect` which just evaluates in the same
scope and doesn't require any closure.

When `$in` is detected in an expression, it is replaced with a new
variable (also called `$in`) and wrapped in `Expr::Collect`. During
eval, this expression is evaluated directly, with the input and with
that new variable set to the collected value.

Other than being faster and less prone to gotchas, it also makes it
possible to typecheck the output of an expression containing `$in`,
which is nice. This is a breaking change though, because of the lack of
the closure and because now typechecking will actually happen. Also, I
haven't attempted to typecheck the input yet.

The IR generated now just looks like this:

```gas
collect        %in
clone          %tmp, %in
store-variable $in, %tmp
# %out <- ...expression... <- %in
drop-variable  $in
```

(where `$in` is the local variable created for this collection, and not
`IN_VARIABLE_ID`)

which is a lot better than having to create a closure and call `collect
--keep-env`, dealing with all of the capture gathering and allocation
that entails. Ideally we can also detect whether that input is actually
needed, so maybe we don't have to clone, but I haven't tried to do that
yet. Theoretically now that the variable is a unique one every time, it
should be possible to give it a type - I just don't know how to
determine that yet.

On top of that, I've also reworked how `$in` works in pipeline-initial
position. Previously, it was a little bit inconsistent. For example,
this worked:

```nushell
> 3 | do { let x = $in; let y = $in; print $x $y }
3
3
```

However, this causes a runtime variable not found error on the second
`$in`:

```nushell
> def foo [] { let x = $in; let y = $in; print $x $y }; 3 | foo
Error: nu::shell::variable_not_found

  × Variable not found
   ╭─[entry #115:1:35]
 1 │ def foo [] { let x = $in; let y = $in; print $x $y }; 3 | foo
   ·                                   ─┬─
   ·                                    ╰── variable not found
   ╰────
```

I've fixed this by making the first element `$in` detection *always*
happen at the block level, so if you use `$in` in pipeline-initial
position anywhere in a block, it will collect with an implicit
subexpression around the whole thing, and you can then use that `$in`
more than once. In doing this I also rewrote `parse_pipeline()` and
hopefully it's a bit more straightforward and possibly more efficient
too now.

Finally, I've tried to make `let` and `mut` a lot more straightforward
with how they handle the rest of the pipeline, and using a redirection
with `let`/`mut` now does what you'd expect if you assume that they
consume the whole pipeline - the redirection is just processed as
normal. These both work now:

```nushell
let x = ^foo err> err.txt
let y = ^foo out+err>| str length
```

It was previously possible to accomplish this with a subexpression, but
it just seemed like a weird gotcha that you couldn't do it. Intuitively,
`let` and `mut` just seem to take the whole line.

- closes #13137

# User-Facing Changes
- `$in` will behave more consistently with blocks and closures, since
the entire block is now just wrapped to handle it if it appears in the
first pipeline element
- `$in` no longer creates a closure, so what can be done within an
expression containing `$in` is less restrictive
- `$in` containing expressions are now type checked, rather than just
resulting in `any`. However, `$in` itself is still `any`, so this isn't
quite perfect yet
- Redirections are now allowed in `let` and `mut` and behave pretty much
how you'd expect

# Tests + Formatting
Added tests to cover the new behaviour.

# After Submitting
- [ ] release notes (definitely breaking change)
---
 crates/nu-cli/src/syntax_highlight.rs         |   8 +
 crates/nu-cmd-lang/src/example_support.rs     |   7 +-
 crates/nu-command/src/example_test.rs         |   3 +-
 .../nu-command/tests/commands/redirection.rs  |  34 ++
 crates/nu-engine/src/compile/builder.rs       |   1 +
 crates/nu-engine/src/compile/expression.rs    |  21 ++
 crates/nu-engine/src/eval.rs                  |  58 ++-
 crates/nu-engine/src/eval_ir.rs               |  22 +-
 crates/nu-parser/src/flatten.rs               |   3 +
 crates/nu-parser/src/lite_parser.rs           |  29 ++
 crates/nu-parser/src/parser.rs                | 355 +++++++-----------
 crates/nu-parser/tests/test_parser.rs         |  63 +++-
 crates/nu-protocol/src/ast/block.rs           |  13 +
 crates/nu-protocol/src/ast/expr.rs            |   9 +-
 crates/nu-protocol/src/ast/expression.rs      | 166 ++++++--
 crates/nu-protocol/src/ast/pipeline.rs        |  43 ++-
 crates/nu-protocol/src/debugger/profiler.rs   |   1 +
 crates/nu-protocol/src/eval_base.rs           |  10 +
 crates/nu-protocol/src/eval_const.rs          |   9 +
 crates/nu-protocol/src/ir/display.rs          |   4 +
 crates/nu-protocol/src/ir/mod.rs              |   3 +
 crates/nuon/src/from.rs                       |   6 +
 tests/repl/test_engine.rs                     |  23 ++
 tests/repl/test_type_check.rs                 |  13 +
 24 files changed, 631 insertions(+), 273 deletions(-)

diff --git a/crates/nu-cli/src/syntax_highlight.rs b/crates/nu-cli/src/syntax_highlight.rs
index 08dcbb1346..ebd8315190 100644
--- a/crates/nu-cli/src/syntax_highlight.rs
+++ b/crates/nu-cli/src/syntax_highlight.rs
@@ -429,6 +429,14 @@ fn find_matching_block_end_in_expr(
                 )
             }),
 
+            Expr::Collect(_, expr) => find_matching_block_end_in_expr(
+                line,
+                working_set,
+                expr,
+                global_span_offset,
+                global_cursor_offset,
+            ),
+
             Expr::Block(block_id)
             | Expr::Closure(block_id)
             | Expr::RowCondition(block_id)
diff --git a/crates/nu-cmd-lang/src/example_support.rs b/crates/nu-cmd-lang/src/example_support.rs
index bb03bbaf8c..72ea36972f 100644
--- a/crates/nu-cmd-lang/src/example_support.rs
+++ b/crates/nu-cmd-lang/src/example_support.rs
@@ -4,7 +4,7 @@ use nu_protocol::{
     ast::Block,
     debugger::WithoutDebug,
     engine::{StateDelta, StateWorkingSet},
-    Range,
+    report_error_new, Range,
 };
 use std::{
     sync::Arc,
@@ -124,7 +124,10 @@ pub fn eval_block(
 
     nu_engine::eval_block::<WithoutDebug>(engine_state, &mut stack, &block, input)
         .and_then(|data| data.into_value(Span::test_data()))
-        .unwrap_or_else(|err| panic!("test eval error in `{}`: {:?}", "TODO", err))
+        .unwrap_or_else(|err| {
+            report_error_new(engine_state, &err);
+            panic!("test eval error in `{}`: {:?}", "TODO", err)
+        })
 }
 
 pub fn check_example_evaluates_to_expected_output(
diff --git a/crates/nu-command/src/example_test.rs b/crates/nu-command/src/example_test.rs
index 476113aa2d..4879cab0a9 100644
--- a/crates/nu-command/src/example_test.rs
+++ b/crates/nu-command/src/example_test.rs
@@ -31,7 +31,7 @@ mod test_examples {
         check_example_evaluates_to_expected_output,
         check_example_input_and_output_types_match_command_signature,
     };
-    use nu_cmd_lang::{Break, Echo, If, Let, Mut};
+    use nu_cmd_lang::{Break, Describe, Echo, If, Let, Mut};
     use nu_protocol::{
         engine::{Command, EngineState, StateWorkingSet},
         Type,
@@ -81,6 +81,7 @@ mod test_examples {
             working_set.add_decl(Box::new(Break));
             working_set.add_decl(Box::new(Date));
             working_set.add_decl(Box::new(Default));
+            working_set.add_decl(Box::new(Describe));
             working_set.add_decl(Box::new(Each));
             working_set.add_decl(Box::new(Echo));
             working_set.add_decl(Box::new(Enumerate));
diff --git a/crates/nu-command/tests/commands/redirection.rs b/crates/nu-command/tests/commands/redirection.rs
index 1eb57077b9..02f85a90f8 100644
--- a/crates/nu-command/tests/commands/redirection.rs
+++ b/crates/nu-command/tests/commands/redirection.rs
@@ -439,3 +439,37 @@ fn no_duplicate_redirection() {
         );
     });
 }
+
+#[rstest::rstest]
+#[case("let", "out>")]
+#[case("let", "err>")]
+#[case("let", "out+err>")]
+#[case("mut", "out>")]
+#[case("mut", "err>")]
+#[case("mut", "out+err>")]
+fn file_redirection_in_let_and_mut(#[case] keyword: &str, #[case] redir: &str) {
+    Playground::setup("file redirection in let and mut", |dirs, _| {
+        let actual = nu!(
+            cwd: dirs.test(),
+            format!("$env.BAZ = 'foo'; {keyword} v = nu --testbin echo_env_mixed out-err BAZ BAZ {redir} result.txt")
+        );
+        assert!(actual.status.success());
+        assert!(file_contents(dirs.test().join("result.txt")).contains("foo"));
+    })
+}
+
+#[rstest::rstest]
+#[case("let", "err>|", "foo3")]
+#[case("let", "out+err>|", "7")]
+#[case("mut", "err>|", "foo3")]
+#[case("mut", "out+err>|", "7")]
+fn pipe_redirection_in_let_and_mut(
+    #[case] keyword: &str,
+    #[case] redir: &str,
+    #[case] output: &str,
+) {
+    let actual = nu!(
+        format!("$env.BAZ = 'foo'; {keyword} v = nu --testbin echo_env_mixed out-err BAZ BAZ {redir} str length; $v")
+    );
+    assert_eq!(actual.out, output);
+}
diff --git a/crates/nu-engine/src/compile/builder.rs b/crates/nu-engine/src/compile/builder.rs
index d77075cc3f..4294b2bd61 100644
--- a/crates/nu-engine/src/compile/builder.rs
+++ b/crates/nu-engine/src/compile/builder.rs
@@ -204,6 +204,7 @@ impl BlockBuilder {
             Instruction::Drain { src } => allocate(&[*src], &[]),
             Instruction::LoadVariable { dst, var_id: _ } => allocate(&[], &[*dst]),
             Instruction::StoreVariable { var_id: _, src } => allocate(&[*src], &[]),
+            Instruction::DropVariable { var_id: _ } => Ok(()),
             Instruction::LoadEnv { dst, key: _ } => allocate(&[], &[*dst]),
             Instruction::LoadEnvOpt { dst, key: _ } => allocate(&[], &[*dst]),
             Instruction::StoreEnv { key: _, src } => allocate(&[*src], &[]),
diff --git a/crates/nu-engine/src/compile/expression.rs b/crates/nu-engine/src/compile/expression.rs
index 38ee58ea26..948eb5c238 100644
--- a/crates/nu-engine/src/compile/expression.rs
+++ b/crates/nu-engine/src/compile/expression.rs
@@ -171,6 +171,27 @@ pub(crate) fn compile_expression(
                 Err(CompileError::UnsupportedOperatorExpression { span: op.span })
             }
         }
+        Expr::Collect(var_id, expr) => {
+            let store_reg = if let Some(in_reg) = in_reg {
+                // Collect, clone, store
+                builder.push(Instruction::Collect { src_dst: in_reg }.into_spanned(expr.span))?;
+                builder.clone_reg(in_reg, expr.span)?
+            } else {
+                // Just store nothing in the variable
+                builder.literal(Literal::Nothing.into_spanned(Span::unknown()))?
+            };
+            builder.push(
+                Instruction::StoreVariable {
+                    var_id: *var_id,
+                    src: store_reg,
+                }
+                .into_spanned(expr.span),
+            )?;
+            compile_expression(working_set, builder, expr, redirect_modes, in_reg, out_reg)?;
+            // Clean it up afterward
+            builder.push(Instruction::DropVariable { var_id: *var_id }.into_spanned(expr.span))?;
+            Ok(())
+        }
         Expr::Subexpression(block_id) => {
             let block = working_set.get_block(*block_id);
             compile_block(working_set, builder, block, redirect_modes, in_reg, out_reg)
diff --git a/crates/nu-engine/src/eval.rs b/crates/nu-engine/src/eval.rs
index 42d0aed288..84fa700934 100644
--- a/crates/nu-engine/src/eval.rs
+++ b/crates/nu-engine/src/eval.rs
@@ -10,8 +10,8 @@ use nu_protocol::{
     debugger::DebugContext,
     engine::{Closure, EngineState, Redirection, Stack, StateWorkingSet},
     eval_base::Eval,
-    ByteStreamSource, Config, FromValue, IntoPipelineData, OutDest, PipelineData, ShellError, Span,
-    Spanned, Type, Value, VarId, ENV_VARIABLE_ID,
+    ByteStreamSource, Config, DataSource, FromValue, IntoPipelineData, OutDest, PipelineData,
+    PipelineMetadata, ShellError, Span, Spanned, Type, Value, VarId, ENV_VARIABLE_ID,
 };
 use nu_utils::IgnoreCaseExt;
 use std::{fs::OpenOptions, path::PathBuf, sync::Arc};
@@ -259,6 +259,10 @@ pub fn eval_expression_with_input<D: DebugContext>(
             input = eval_external(engine_state, stack, head, args, input)?;
         }
 
+        Expr::Collect(var_id, expr) => {
+            input = eval_collect::<D>(engine_state, stack, *var_id, expr, input)?;
+        }
+
         Expr::Subexpression(block_id) => {
             let block = engine_state.get_block(*block_id);
             // FIXME: protect this collect with ctrl-c
@@ -605,6 +609,44 @@ pub fn eval_block<D: DebugContext>(
     Ok(input)
 }
 
+pub fn eval_collect<D: DebugContext>(
+    engine_state: &EngineState,
+    stack: &mut Stack,
+    var_id: VarId,
+    expr: &Expression,
+    input: PipelineData,
+) -> Result<PipelineData, ShellError> {
+    // Evaluate the expression with the variable set to the collected input
+    let span = input.span().unwrap_or(Span::unknown());
+
+    let metadata = match input.metadata() {
+        // Remove the `FilePath` metadata, because after `collect` it's no longer necessary to
+        // check where some input came from.
+        Some(PipelineMetadata {
+            data_source: DataSource::FilePath(_),
+            content_type: None,
+        }) => None,
+        other => other,
+    };
+
+    let input = input.into_value(span)?;
+
+    stack.add_var(var_id, input.clone());
+
+    let result = eval_expression_with_input::<D>(
+        engine_state,
+        stack,
+        expr,
+        // We still have to pass it as input
+        input.into_pipeline_data_with_metadata(metadata),
+    )
+    .map(|(result, _failed)| result);
+
+    stack.remove_var(var_id);
+
+    result
+}
+
 pub fn eval_subexpression<D: DebugContext>(
     engine_state: &EngineState,
     stack: &mut Stack,
@@ -729,6 +771,18 @@ impl Eval for EvalRuntime {
         eval_external(engine_state, stack, head, args, PipelineData::empty())?.into_value(span)
     }
 
+    fn eval_collect<D: DebugContext>(
+        engine_state: &EngineState,
+        stack: &mut Stack,
+        var_id: VarId,
+        expr: &Expression,
+    ) -> Result<Value, ShellError> {
+        // It's a little bizarre, but the expression can still have some kind of result even with
+        // nothing input
+        eval_collect::<D>(engine_state, stack, var_id, expr, PipelineData::empty())?
+            .into_value(expr.span)
+    }
+
     fn eval_subexpression<D: DebugContext>(
         engine_state: &EngineState,
         stack: &mut Stack,
diff --git a/crates/nu-engine/src/eval_ir.rs b/crates/nu-engine/src/eval_ir.rs
index 29a677bd69..8550cbab99 100644
--- a/crates/nu-engine/src/eval_ir.rs
+++ b/crates/nu-engine/src/eval_ir.rs
@@ -6,9 +6,9 @@ use nu_protocol::{
     debugger::DebugContext,
     engine::{Argument, Closure, EngineState, ErrorHandler, Matcher, Redirection, Stack},
     ir::{Call, DataSlice, Instruction, IrAstRef, IrBlock, Literal, RedirectMode},
-    record, ByteStreamSource, DeclId, ErrSpan, Flag, IntoPipelineData, IntoSpanned, ListStream,
-    OutDest, PipelineData, PositionalArg, Range, Record, RegId, ShellError, Signals, Signature,
-    Span, Spanned, Type, Value, VarId, ENV_VARIABLE_ID,
+    record, ByteStreamSource, DataSource, DeclId, ErrSpan, Flag, IntoPipelineData, IntoSpanned,
+    ListStream, OutDest, PipelineData, PipelineMetadata, PositionalArg, Range, Record, RegId,
+    ShellError, Signals, Signature, Span, Spanned, Type, Value, VarId, ENV_VARIABLE_ID,
 };
 use nu_utils::IgnoreCaseExt;
 
@@ -345,6 +345,10 @@ fn eval_instruction<D: DebugContext>(
             ctx.stack.add_var(*var_id, value);
             Ok(Continue)
         }
+        Instruction::DropVariable { var_id } => {
+            ctx.stack.remove_var(*var_id);
+            Ok(Continue)
+        }
         Instruction::LoadEnv { dst, key } => {
             let key = ctx.get_str(*key, *span)?;
             if let Some(value) = get_env_var_case_insensitive(ctx, key) {
@@ -1341,9 +1345,19 @@ fn get_env_var_name_case_insensitive<'a>(ctx: &mut EvalContext<'_>, key: &'a str
 }
 
 /// Helper to collect values into [`PipelineData`], preserving original span and metadata
+///
+/// The metadata is removed if it is the file data source, as that's just meant to mark streams.
 fn collect(data: PipelineData, fallback_span: Span) -> Result<PipelineData, ShellError> {
     let span = data.span().unwrap_or(fallback_span);
-    let metadata = data.metadata();
+    let metadata = match data.metadata() {
+        // Remove the `FilePath` metadata, because after `collect` it's no longer necessary to
+        // check where some input came from.
+        Some(PipelineMetadata {
+            data_source: DataSource::FilePath(_),
+            content_type: None,
+        }) => None,
+        other => other,
+    };
     let value = data.into_value(span)?;
     Ok(PipelineData::Value(value, metadata))
 }
diff --git a/crates/nu-parser/src/flatten.rs b/crates/nu-parser/src/flatten.rs
index 88638b2475..8a05edaef4 100644
--- a/crates/nu-parser/src/flatten.rs
+++ b/crates/nu-parser/src/flatten.rs
@@ -189,6 +189,9 @@ fn flatten_expression_into(
             ));
             flatten_expression_into(working_set, not, output);
         }
+        Expr::Collect(_, expr) => {
+            flatten_expression_into(working_set, expr, output);
+        }
         Expr::Closure(block_id) => {
             let outer_span = expr.span;
 
diff --git a/crates/nu-parser/src/lite_parser.rs b/crates/nu-parser/src/lite_parser.rs
index f04b8befdc..c9e9578698 100644
--- a/crates/nu-parser/src/lite_parser.rs
+++ b/crates/nu-parser/src/lite_parser.rs
@@ -2,6 +2,7 @@
 //! can be parsed.
 
 use crate::{Token, TokenContents};
+use itertools::{Either, Itertools};
 use nu_protocol::{ast::RedirectionSource, ParseError, Span};
 use std::mem;
 
@@ -24,6 +25,15 @@ impl LiteRedirectionTarget {
             | LiteRedirectionTarget::Pipe { connector } => *connector,
         }
     }
+
+    pub fn spans(&self) -> impl Iterator<Item = Span> {
+        match *self {
+            LiteRedirectionTarget::File {
+                connector, file, ..
+            } => Either::Left([connector, file].into_iter()),
+            LiteRedirectionTarget::Pipe { connector } => Either::Right(std::iter::once(connector)),
+        }
+    }
 }
 
 #[derive(Debug, Clone)]
@@ -38,6 +48,17 @@ pub enum LiteRedirection {
     },
 }
 
+impl LiteRedirection {
+    pub fn spans(&self) -> impl Iterator<Item = Span> {
+        match self {
+            LiteRedirection::Single { target, .. } => Either::Left(target.spans()),
+            LiteRedirection::Separate { out, err } => {
+                Either::Right(out.spans().chain(err.spans()).sorted())
+            }
+        }
+    }
+}
+
 #[derive(Debug, Clone, Default)]
 pub struct LiteCommand {
     pub pipe: Option<Span>,
@@ -113,6 +134,14 @@ impl LiteCommand {
 
         Ok(())
     }
+
+    pub fn parts_including_redirection(&self) -> impl Iterator<Item = Span> + '_ {
+        self.parts.iter().copied().chain(
+            self.redirection
+                .iter()
+                .flat_map(|redirection| redirection.spans()),
+        )
+    }
 }
 
 #[derive(Debug, Clone, Default)]
diff --git a/crates/nu-parser/src/parser.rs b/crates/nu-parser/src/parser.rs
index fc2131aad7..bc48b9bd0c 100644
--- a/crates/nu-parser/src/parser.rs
+++ b/crates/nu-parser/src/parser.rs
@@ -5299,6 +5299,7 @@ pub fn parse_expression(working_set: &mut StateWorkingSet, spans: &[Span]) -> Ex
             let mut block = Block::default();
             let ty = output.ty.clone();
             block.pipelines = vec![Pipeline::from_vec(vec![output])];
+            block.span = Some(Span::concat(spans));
 
             compile_block(working_set, &mut block);
 
@@ -5393,9 +5394,19 @@ pub fn parse_builtin_commands(
     match name {
         b"def" => parse_def(working_set, lite_command, None).0,
         b"extern" => parse_extern(working_set, lite_command, None),
-        b"let" => parse_let(working_set, &lite_command.parts),
+        b"let" => parse_let(
+            working_set,
+            &lite_command
+                .parts_including_redirection()
+                .collect::<Vec<Span>>(),
+        ),
         b"const" => parse_const(working_set, &lite_command.parts),
-        b"mut" => parse_mut(working_set, &lite_command.parts),
+        b"mut" => parse_mut(
+            working_set,
+            &lite_command
+                .parts_including_redirection()
+                .collect::<Vec<Span>>(),
+        ),
         b"for" => {
             let expr = parse_for(working_set, lite_command);
             Pipeline::from_vec(vec![expr])
@@ -5647,169 +5658,73 @@ pub(crate) fn redirecting_builtin_error(
     }
 }
 
-pub fn parse_pipeline(
-    working_set: &mut StateWorkingSet,
-    pipeline: &LitePipeline,
-    is_subexpression: bool,
-    pipeline_index: usize,
-) -> Pipeline {
+pub fn parse_pipeline(working_set: &mut StateWorkingSet, pipeline: &LitePipeline) -> Pipeline {
+    let first_command = pipeline.commands.first();
+    let first_command_name = first_command
+        .and_then(|command| command.parts.first())
+        .map(|span| working_set.get_span_contents(*span));
+
     if pipeline.commands.len() > 1 {
-        // Special case: allow `let` and `mut` to consume the whole pipeline, eg) `let abc = "foo" | str length`
-        if let Some(&first) = pipeline.commands[0].parts.first() {
-            let first = working_set.get_span_contents(first);
-            if first == b"let" || first == b"mut" {
-                let name = if first == b"let" { "let" } else { "mut" };
-                let mut new_command = LiteCommand {
-                    comments: vec![],
-                    parts: pipeline.commands[0].parts.clone(),
-                    pipe: None,
-                    redirection: None,
-                };
+        // Special case: allow "let" or "mut" to consume the whole pipeline, if this is a pipeline
+        // with multiple commands
+        if matches!(first_command_name, Some(b"let" | b"mut")) {
+            // Merge the pipeline into one command
+            let first_command = first_command.expect("must be Some");
 
-                if let Some(redirection) = pipeline.commands[0].redirection.as_ref() {
-                    working_set.error(redirecting_builtin_error(name, redirection));
-                }
+            let remainder_span = first_command
+                .parts_including_redirection()
+                .skip(3)
+                .chain(
+                    pipeline.commands[1..]
+                        .iter()
+                        .flat_map(|command| command.parts_including_redirection()),
+                )
+                .reduce(Span::append);
 
-                for element in &pipeline.commands[1..] {
-                    if let Some(redirection) = pipeline.commands[0].redirection.as_ref() {
-                        working_set.error(redirecting_builtin_error(name, redirection));
-                    } else {
-                        new_command.parts.push(element.pipe.expect("pipe span"));
-                        new_command.comments.extend_from_slice(&element.comments);
-                        new_command.parts.extend_from_slice(&element.parts);
-                    }
-                }
+            let parts = first_command
+                .parts
+                .iter()
+                .take(3) // the let/mut start itself
+                .copied()
+                .chain(remainder_span) // everything else
+                .collect();
 
-                // if the 'let' is complete enough, use it, if not, fall through for now
-                if new_command.parts.len() > 3 {
-                    let rhs_span = Span::concat(&new_command.parts[3..]);
+            let comments = pipeline
+                .commands
+                .iter()
+                .flat_map(|command| command.comments.iter())
+                .copied()
+                .collect();
 
-                    new_command.parts.truncate(3);
-                    new_command.parts.push(rhs_span);
-
-                    let mut pipeline = parse_builtin_commands(working_set, &new_command);
-
-                    if pipeline_index == 0 {
-                        let let_decl_id = working_set.find_decl(b"let");
-                        let mut_decl_id = working_set.find_decl(b"mut");
-                        for element in pipeline.elements.iter_mut() {
-                            if let Expr::Call(call) = &element.expr.expr {
-                                if Some(call.decl_id) == let_decl_id
-                                    || Some(call.decl_id) == mut_decl_id
-                                {
-                                    // Do an expansion
-                                    if let Some(Expression {
-                                        expr: Expr::Block(block_id),
-                                        ..
-                                    }) = call.positional_iter().nth(1)
-                                    {
-                                        let block = working_set.get_block(*block_id);
-
-                                        if let Some(element) = block
-                                            .pipelines
-                                            .first()
-                                            .and_then(|p| p.elements.first())
-                                            .cloned()
-                                        {
-                                            if element.has_in_variable(working_set) {
-                                                let element = wrap_element_with_collect(
-                                                    working_set,
-                                                    &element,
-                                                );
-                                                let block = working_set.get_block_mut(*block_id);
-                                                block.pipelines[0].elements[0] = element;
-                                            }
-                                        }
-                                    }
-                                    continue;
-                                } else if element.has_in_variable(working_set) && !is_subexpression
-                                {
-                                    *element = wrap_element_with_collect(working_set, element);
-                                }
-                            } else if element.has_in_variable(working_set) && !is_subexpression {
-                                *element = wrap_element_with_collect(working_set, element);
-                            }
-                        }
-                    }
-
-                    return pipeline;
-                }
-            }
-        }
-
-        let mut elements = pipeline
-            .commands
-            .iter()
-            .map(|element| parse_pipeline_element(working_set, element))
-            .collect::<Vec<_>>();
-
-        if is_subexpression {
-            for element in elements.iter_mut().skip(1) {
-                if element.has_in_variable(working_set) {
-                    *element = wrap_element_with_collect(working_set, element);
-                }
-            }
+            let new_command = LiteCommand {
+                pipe: None,
+                comments,
+                parts,
+                redirection: None,
+            };
+            parse_builtin_commands(working_set, &new_command)
         } else {
-            for element in elements.iter_mut() {
-                if element.has_in_variable(working_set) {
-                    *element = wrap_element_with_collect(working_set, element);
-                }
-            }
-        }
-
-        Pipeline { elements }
-    } else {
-        if let Some(&first) = pipeline.commands[0].parts.first() {
-            let first = working_set.get_span_contents(first);
-            if first == b"let" || first == b"mut" {
-                if let Some(redirection) = pipeline.commands[0].redirection.as_ref() {
-                    let name = if first == b"let" { "let" } else { "mut" };
-                    working_set.error(redirecting_builtin_error(name, redirection));
-                }
-            }
-        }
-
-        let mut pipeline = parse_builtin_commands(working_set, &pipeline.commands[0]);
-
-        let let_decl_id = working_set.find_decl(b"let");
-        let mut_decl_id = working_set.find_decl(b"mut");
-
-        if pipeline_index == 0 {
-            for element in pipeline.elements.iter_mut() {
-                if let Expr::Call(call) = &element.expr.expr {
-                    if Some(call.decl_id) == let_decl_id || Some(call.decl_id) == mut_decl_id {
-                        // Do an expansion
-                        if let Some(Expression {
-                            expr: Expr::Block(block_id),
-                            ..
-                        }) = call.positional_iter().nth(1)
-                        {
-                            let block = working_set.get_block(*block_id);
-
-                            if let Some(element) = block
-                                .pipelines
-                                .first()
-                                .and_then(|p| p.elements.first())
-                                .cloned()
-                            {
-                                if element.has_in_variable(working_set) {
-                                    let element = wrap_element_with_collect(working_set, &element);
-                                    let block = working_set.get_block_mut(*block_id);
-                                    block.pipelines[0].elements[0] = element;
-                                }
-                            }
-                        }
-                        continue;
-                    } else if element.has_in_variable(working_set) && !is_subexpression {
-                        *element = wrap_element_with_collect(working_set, element);
+            // Parse a normal multi command pipeline
+            let elements: Vec<_> = pipeline
+                .commands
+                .iter()
+                .enumerate()
+                .map(|(index, element)| {
+                    let element = parse_pipeline_element(working_set, element);
+                    // Handle $in for pipeline elements beyond the first one
+                    if index > 0 && element.has_in_variable(working_set) {
+                        wrap_element_with_collect(working_set, element.clone())
+                    } else {
+                        element
                     }
-                } else if element.has_in_variable(working_set) && !is_subexpression {
-                    *element = wrap_element_with_collect(working_set, element);
-                }
-            }
-        }
+                })
+                .collect();
 
-        pipeline
+            Pipeline { elements }
+        }
+    } else {
+        // If there's only one command in the pipeline, this could be a builtin command
+        parse_builtin_commands(working_set, &pipeline.commands[0])
     }
 }
 
@@ -5840,18 +5755,45 @@ pub fn parse_block(
     }
 
     let mut block = Block::new_with_capacity(lite_block.block.len());
+    block.span = Some(span);
 
-    for (idx, lite_pipeline) in lite_block.block.iter().enumerate() {
-        let pipeline = parse_pipeline(working_set, lite_pipeline, is_subexpression, idx);
+    for lite_pipeline in &lite_block.block {
+        let pipeline = parse_pipeline(working_set, lite_pipeline);
         block.pipelines.push(pipeline);
     }
 
+    // If this is not a subexpression and there are any pipelines where the first element has $in,
+    // we can wrap the whole block in collect so that they all reference the same $in
+    if !is_subexpression
+        && block
+            .pipelines
+            .iter()
+            .flat_map(|pipeline| pipeline.elements.first())
+            .any(|element| element.has_in_variable(working_set))
+    {
+        // Move the block out to prepare it to become a subexpression
+        let inner_block = std::mem::take(&mut block);
+        block.span = inner_block.span;
+        let ty = inner_block.output_type();
+        let block_id = working_set.add_block(Arc::new(inner_block));
+
+        // Now wrap it in a Collect expression, and put it in the block as the only pipeline
+        let subexpression = Expression::new(working_set, Expr::Subexpression(block_id), span, ty);
+        let collect = wrap_expr_with_collect(working_set, subexpression);
+
+        block.pipelines.push(Pipeline {
+            elements: vec![PipelineElement {
+                pipe: None,
+                expr: collect,
+                redirection: None,
+            }],
+        });
+    }
+
     if scoped {
         working_set.exit_scope();
     }
 
-    block.span = Some(span);
-
     let errors = type_check::check_block_input_output(working_set, &block);
     if !errors.is_empty() {
         working_set.parse_errors.extend_from_slice(&errors);
@@ -6220,6 +6162,10 @@ pub fn discover_captures_in_expr(
                 discover_captures_in_expr(working_set, &match_.1, seen, seen_blocks, output)?;
             }
         }
+        Expr::Collect(var_id, expr) => {
+            seen.push(*var_id);
+            discover_captures_in_expr(working_set, expr, seen, seen_blocks, output)?
+        }
         Expr::RowCondition(block_id) | Expr::Subexpression(block_id) => {
             let block = working_set.get_block(*block_id);
 
@@ -6270,28 +6216,28 @@ pub fn discover_captures_in_expr(
 
 fn wrap_redirection_with_collect(
     working_set: &mut StateWorkingSet,
-    target: &RedirectionTarget,
+    target: RedirectionTarget,
 ) -> RedirectionTarget {
     match target {
         RedirectionTarget::File { expr, append, span } => RedirectionTarget::File {
             expr: wrap_expr_with_collect(working_set, expr),
-            span: *span,
-            append: *append,
+            span,
+            append,
         },
-        RedirectionTarget::Pipe { span } => RedirectionTarget::Pipe { span: *span },
+        RedirectionTarget::Pipe { span } => RedirectionTarget::Pipe { span },
     }
 }
 
 fn wrap_element_with_collect(
     working_set: &mut StateWorkingSet,
-    element: &PipelineElement,
+    element: PipelineElement,
 ) -> PipelineElement {
     PipelineElement {
         pipe: element.pipe,
-        expr: wrap_expr_with_collect(working_set, &element.expr),
-        redirection: element.redirection.as_ref().map(|r| match r {
+        expr: wrap_expr_with_collect(working_set, element.expr),
+        redirection: element.redirection.map(|r| match r {
             PipelineRedirection::Single { source, target } => PipelineRedirection::Single {
-                source: *source,
+                source,
                 target: wrap_redirection_with_collect(working_set, target),
             },
             PipelineRedirection::Separate { out, err } => PipelineRedirection::Separate {
@@ -6302,65 +6248,24 @@ fn wrap_element_with_collect(
     }
 }
 
-fn wrap_expr_with_collect(working_set: &mut StateWorkingSet, expr: &Expression) -> Expression {
+fn wrap_expr_with_collect(working_set: &mut StateWorkingSet, expr: Expression) -> Expression {
     let span = expr.span;
 
-    if let Some(decl_id) = working_set.find_decl(b"collect") {
-        let mut output = vec![];
+    // IN_VARIABLE_ID should get replaced with a unique variable, so that we don't have to
+    // execute as a closure
+    let var_id = working_set.add_variable(b"$in".into(), expr.span, Type::Any, false);
+    let mut expr = expr.clone();
+    expr.replace_in_variable(working_set, var_id);
 
-        let var_id = IN_VARIABLE_ID;
-        let mut signature = Signature::new("");
-        signature.required_positional.push(PositionalArg {
-            var_id: Some(var_id),
-            name: "$in".into(),
-            desc: String::new(),
-            shape: SyntaxShape::Any,
-            default_value: None,
-        });
-
-        let mut block = Block {
-            pipelines: vec![Pipeline::from_vec(vec![expr.clone()])],
-            signature: Box::new(signature),
-            ..Default::default()
-        };
-
-        compile_block(working_set, &mut block);
-
-        let block_id = working_set.add_block(Arc::new(block));
-
-        output.push(Argument::Positional(Expression::new(
-            working_set,
-            Expr::Closure(block_id),
-            span,
-            Type::Any,
-        )));
-
-        output.push(Argument::Named((
-            Spanned {
-                item: "keep-env".to_string(),
-                span: Span::new(0, 0),
-            },
-            None,
-            None,
-        )));
-
-        // The containing, synthetic call to `collect`.
-        // We don't want to have a real span as it will confuse flattening
-        // The args are where we'll get the real info
-        Expression::new(
-            working_set,
-            Expr::Call(Box::new(Call {
-                head: Span::new(0, 0),
-                arguments: output,
-                decl_id,
-                parser_info: HashMap::new(),
-            })),
-            span,
-            Type::Any,
-        )
-    } else {
-        Expression::garbage(working_set, span)
-    }
+    // Bind the custom `$in` variable for that particular expression
+    let ty = expr.ty.clone();
+    Expression::new(
+        working_set,
+        Expr::Collect(var_id, Box::new(expr)),
+        span,
+        // We can expect it to have the same result type
+        ty,
+    )
 }
 
 // Parses a vector of u8 to create an AST Block. If a file name is given, then
diff --git a/crates/nu-parser/tests/test_parser.rs b/crates/nu-parser/tests/test_parser.rs
index 7762863ba1..4d6778d201 100644
--- a/crates/nu-parser/tests/test_parser.rs
+++ b/crates/nu-parser/tests/test_parser.rs
@@ -41,6 +41,41 @@ impl Command for Let {
     }
 }
 
+#[cfg(test)]
+#[derive(Clone)]
+pub struct Mut;
+
+#[cfg(test)]
+impl Command for Mut {
+    fn name(&self) -> &str {
+        "mut"
+    }
+
+    fn usage(&self) -> &str {
+        "Mock mut command."
+    }
+
+    fn signature(&self) -> nu_protocol::Signature {
+        Signature::build("mut")
+            .required("var_name", SyntaxShape::VarWithOptType, "variable name")
+            .required(
+                "initial_value",
+                SyntaxShape::Keyword(b"=".to_vec(), Box::new(SyntaxShape::MathExpression)),
+                "equals sign followed by value",
+            )
+    }
+
+    fn run(
+        &self,
+        _engine_state: &EngineState,
+        _stack: &mut Stack,
+        _call: &Call,
+        _input: PipelineData,
+    ) -> Result<PipelineData, ShellError> {
+        todo!()
+    }
+}
+
 fn test_int(
     test_tag: &str,     // name of sub-test
     test: &[u8],        // input expression
@@ -1149,11 +1184,29 @@ fn test_nothing_comparison_eq() {
 fn test_redirection_with_letmut(#[case] phase: &[u8]) {
     let engine_state = EngineState::new();
     let mut working_set = StateWorkingSet::new(&engine_state);
-    let _block = parse(&mut working_set, None, phase, true);
-    assert!(matches!(
-        working_set.parse_errors.first(),
-        Some(ParseError::RedirectingBuiltinCommand(_, _, _))
-    ));
+    working_set.add_decl(Box::new(Let));
+    working_set.add_decl(Box::new(Mut));
+
+    let block = parse(&mut working_set, None, phase, true);
+    assert!(
+        working_set.parse_errors.is_empty(),
+        "parse errors: {:?}",
+        working_set.parse_errors
+    );
+    assert_eq!(1, block.pipelines[0].elements.len());
+
+    let element = &block.pipelines[0].elements[0];
+    assert!(element.redirection.is_none()); // it should be in the let block, not here
+
+    if let Expr::Call(call) = &element.expr.expr {
+        let arg = call.positional_nth(1).expect("no positional args");
+        let block_id = arg.as_block().expect("arg 1 is not a block");
+        let block = working_set.get_block(block_id);
+        let inner_element = &block.pipelines[0].elements[0];
+        assert!(inner_element.redirection.is_some());
+    } else {
+        panic!("expected Call: {:?}", block.pipelines[0].elements[0])
+    }
 }
 
 #[rstest]
diff --git a/crates/nu-protocol/src/ast/block.rs b/crates/nu-protocol/src/ast/block.rs
index 8f62ff99ba..a6e640cc15 100644
--- a/crates/nu-protocol/src/ast/block.rs
+++ b/crates/nu-protocol/src/ast/block.rs
@@ -78,6 +78,19 @@ impl Block {
             Type::Nothing
         }
     }
+
+    /// Replace any `$in` variables in the initial element of pipelines within the block
+    pub fn replace_in_variable(
+        &mut self,
+        working_set: &mut StateWorkingSet<'_>,
+        new_var_id: VarId,
+    ) {
+        for pipeline in self.pipelines.iter_mut() {
+            if let Some(element) = pipeline.elements.first_mut() {
+                element.replace_in_variable(working_set, new_var_id);
+            }
+        }
+    }
 }
 
 impl<T> From<T> for Block
diff --git a/crates/nu-protocol/src/ast/expr.rs b/crates/nu-protocol/src/ast/expr.rs
index fadbffc51f..d6386f92dc 100644
--- a/crates/nu-protocol/src/ast/expr.rs
+++ b/crates/nu-protocol/src/ast/expr.rs
@@ -25,6 +25,7 @@ pub enum Expr {
     RowCondition(BlockId),
     UnaryNot(Box<Expression>),
     BinaryOp(Box<Expression>, Box<Expression>, Box<Expression>), //lhs, op, rhs
+    Collect(VarId, Box<Expression>),
     Subexpression(BlockId),
     Block(BlockId),
     Closure(BlockId),
@@ -65,11 +66,13 @@ impl Expr {
         &self,
         working_set: &StateWorkingSet,
     ) -> (Option<OutDest>, Option<OutDest>) {
-        // Usages of `$in` will be wrapped by a `collect` call by the parser,
-        // so we do not have to worry about that when considering
-        // which of the expressions below may consume pipeline output.
         match self {
             Expr::Call(call) => working_set.get_decl(call.decl_id).pipe_redirection(),
+            Expr::Collect(_, _) => {
+                // A collect expression always has default redirection, it's just going to collect
+                // stdout unless another type of redirection is specified
+                (None, None)
+            },
             Expr::Subexpression(block_id) | Expr::Block(block_id) => working_set
                 .get_block(*block_id)
                 .pipe_redirection(working_set),
diff --git a/crates/nu-protocol/src/ast/expression.rs b/crates/nu-protocol/src/ast/expression.rs
index dfd4187a90..4818aa2db2 100644
--- a/crates/nu-protocol/src/ast/expression.rs
+++ b/crates/nu-protocol/src/ast/expression.rs
@@ -6,6 +6,8 @@ use crate::{
 use serde::{Deserialize, Serialize};
 use std::sync::Arc;
 
+use super::ListItem;
+
 /// Wrapper around [`Expr`]
 #[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
 pub struct Expression {
@@ -106,37 +108,14 @@ impl Expression {
                 left.has_in_variable(working_set) || right.has_in_variable(working_set)
             }
             Expr::UnaryNot(expr) => expr.has_in_variable(working_set),
-            Expr::Block(block_id) => {
+            Expr::Block(block_id) | Expr::Closure(block_id) => {
                 let block = working_set.get_block(*block_id);
-
-                if block.captures.contains(&IN_VARIABLE_ID) {
-                    return true;
-                }
-
-                if let Some(pipeline) = block.pipelines.first() {
-                    match pipeline.elements.first() {
-                        Some(element) => element.has_in_variable(working_set),
-                        None => false,
-                    }
-                } else {
-                    false
-                }
-            }
-            Expr::Closure(block_id) => {
-                let block = working_set.get_block(*block_id);
-
-                if block.captures.contains(&IN_VARIABLE_ID) {
-                    return true;
-                }
-
-                if let Some(pipeline) = block.pipelines.first() {
-                    match pipeline.elements.first() {
-                        Some(element) => element.has_in_variable(working_set),
-                        None => false,
-                    }
-                } else {
-                    false
-                }
+                block.captures.contains(&IN_VARIABLE_ID)
+                    || block
+                        .pipelines
+                        .iter()
+                        .flat_map(|pipeline| pipeline.elements.first())
+                        .any(|element| element.has_in_variable(working_set))
             }
             Expr::Binary(_) => false,
             Expr::Bool(_) => false,
@@ -251,6 +230,9 @@ impl Expression {
             Expr::Signature(_) => false,
             Expr::String(_) => false,
             Expr::RawString(_) => false,
+            // A `$in` variable found within a `Collect` is local, as it's already been wrapped
+            // This is probably unlikely to happen anyway - the expressions are wrapped depth-first
+            Expr::Collect(_, _) => false,
             Expr::RowCondition(block_id) | Expr::Subexpression(block_id) => {
                 let block = working_set.get_block(*block_id);
 
@@ -414,6 +396,7 @@ impl Expression {
                     i.replace_span(working_set, replaced, new_span)
                 }
             }
+            Expr::Collect(_, expr) => expr.replace_span(working_set, replaced, new_span),
             Expr::RowCondition(block_id) | Expr::Subexpression(block_id) => {
                 let mut block = (**working_set.get_block(*block_id)).clone();
 
@@ -443,6 +426,129 @@ impl Expression {
         }
     }
 
+    pub fn replace_in_variable(&mut self, working_set: &mut StateWorkingSet, new_var_id: VarId) {
+        match &mut self.expr {
+            Expr::Bool(_) => {}
+            Expr::Int(_) => {}
+            Expr::Float(_) => {}
+            Expr::Binary(_) => {}
+            Expr::Range(_) => {}
+            Expr::Var(var_id) | Expr::VarDecl(var_id) => {
+                if *var_id == IN_VARIABLE_ID {
+                    *var_id = new_var_id;
+                }
+            }
+            Expr::Call(call) => {
+                for arg in call.arguments.iter_mut() {
+                    match arg {
+                        Argument::Positional(expr)
+                        | Argument::Unknown(expr)
+                        | Argument::Named((_, _, Some(expr)))
+                        | Argument::Spread(expr) => {
+                            expr.replace_in_variable(working_set, new_var_id)
+                        }
+                        Argument::Named((_, _, None)) => {}
+                    }
+                }
+                for expr in call.parser_info.values_mut() {
+                    expr.replace_in_variable(working_set, new_var_id)
+                }
+            }
+            Expr::ExternalCall(head, args) => {
+                head.replace_in_variable(working_set, new_var_id);
+                for arg in args.iter_mut() {
+                    match arg {
+                        ExternalArgument::Regular(expr) | ExternalArgument::Spread(expr) => {
+                            expr.replace_in_variable(working_set, new_var_id)
+                        }
+                    }
+                }
+            }
+            Expr::Operator(_) => {}
+            // `$in` in `Collect` has already been handled, so we don't need to check further
+            Expr::Collect(_, _) => {}
+            Expr::Block(block_id)
+            | Expr::Closure(block_id)
+            | Expr::RowCondition(block_id)
+            | Expr::Subexpression(block_id) => {
+                let mut block = Block::clone(working_set.get_block(*block_id));
+                block.replace_in_variable(working_set, new_var_id);
+                *working_set.get_block_mut(*block_id) = block;
+            }
+            Expr::UnaryNot(expr) => {
+                expr.replace_in_variable(working_set, new_var_id);
+            }
+            Expr::BinaryOp(lhs, op, rhs) => {
+                for expr in [lhs, op, rhs] {
+                    expr.replace_in_variable(working_set, new_var_id);
+                }
+            }
+            Expr::MatchBlock(match_patterns) => {
+                for (_, expr) in match_patterns.iter_mut() {
+                    expr.replace_in_variable(working_set, new_var_id);
+                }
+            }
+            Expr::List(items) => {
+                for item in items.iter_mut() {
+                    match item {
+                        ListItem::Item(expr) | ListItem::Spread(_, expr) => {
+                            expr.replace_in_variable(working_set, new_var_id)
+                        }
+                    }
+                }
+            }
+            Expr::Table(table) => {
+                for col_expr in table.columns.iter_mut() {
+                    col_expr.replace_in_variable(working_set, new_var_id);
+                }
+                for row in table.rows.iter_mut() {
+                    for row_expr in row.iter_mut() {
+                        row_expr.replace_in_variable(working_set, new_var_id);
+                    }
+                }
+            }
+            Expr::Record(items) => {
+                for item in items.iter_mut() {
+                    match item {
+                        RecordItem::Pair(key, val) => {
+                            key.replace_in_variable(working_set, new_var_id);
+                            val.replace_in_variable(working_set, new_var_id);
+                        }
+                        RecordItem::Spread(_, expr) => {
+                            expr.replace_in_variable(working_set, new_var_id)
+                        }
+                    }
+                }
+            }
+            Expr::Keyword(kw) => kw.expr.replace_in_variable(working_set, new_var_id),
+            Expr::ValueWithUnit(value_with_unit) => value_with_unit
+                .expr
+                .replace_in_variable(working_set, new_var_id),
+            Expr::DateTime(_) => {}
+            Expr::Filepath(_, _) => {}
+            Expr::Directory(_, _) => {}
+            Expr::GlobPattern(_, _) => {}
+            Expr::String(_) => {}
+            Expr::RawString(_) => {}
+            Expr::CellPath(_) => {}
+            Expr::FullCellPath(full_cell_path) => {
+                full_cell_path
+                    .head
+                    .replace_in_variable(working_set, new_var_id);
+            }
+            Expr::ImportPattern(_) => {}
+            Expr::Overlay(_) => {}
+            Expr::Signature(_) => {}
+            Expr::StringInterpolation(exprs) | Expr::GlobInterpolation(exprs, _) => {
+                for expr in exprs.iter_mut() {
+                    expr.replace_in_variable(working_set, new_var_id);
+                }
+            }
+            Expr::Nothing => {}
+            Expr::Garbage => {}
+        }
+    }
+
     pub fn new(working_set: &mut StateWorkingSet, expr: Expr, span: Span, ty: Type) -> Expression {
         let span_id = working_set.add_span(span);
         Expression {
diff --git a/crates/nu-protocol/src/ast/pipeline.rs b/crates/nu-protocol/src/ast/pipeline.rs
index 3f2a216485..0ed02c700b 100644
--- a/crates/nu-protocol/src/ast/pipeline.rs
+++ b/crates/nu-protocol/src/ast/pipeline.rs
@@ -1,4 +1,4 @@
-use crate::{ast::Expression, engine::StateWorkingSet, OutDest, Span};
+use crate::{ast::Expression, engine::StateWorkingSet, OutDest, Span, VarId};
 use serde::{Deserialize, Serialize};
 use std::fmt::Display;
 
@@ -62,6 +62,19 @@ impl RedirectionTarget {
             RedirectionTarget::Pipe { .. } => {}
         }
     }
+
+    pub fn replace_in_variable(
+        &mut self,
+        working_set: &mut StateWorkingSet<'_>,
+        new_var_id: VarId,
+    ) {
+        match self {
+            RedirectionTarget::File { expr, .. } => {
+                expr.replace_in_variable(working_set, new_var_id)
+            }
+            RedirectionTarget::Pipe { .. } => {}
+        }
+    }
 }
 
 #[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
@@ -75,6 +88,23 @@ pub enum PipelineRedirection {
         err: RedirectionTarget,
     },
 }
+impl PipelineRedirection {
+    pub fn replace_in_variable(
+        &mut self,
+        working_set: &mut StateWorkingSet<'_>,
+        new_var_id: VarId,
+    ) {
+        match self {
+            PipelineRedirection::Single { source: _, target } => {
+                target.replace_in_variable(working_set, new_var_id)
+            }
+            PipelineRedirection::Separate { out, err } => {
+                out.replace_in_variable(working_set, new_var_id);
+                err.replace_in_variable(working_set, new_var_id);
+            }
+        }
+    }
+}
 
 #[derive(Debug, Clone, Serialize, Deserialize)]
 pub struct PipelineElement {
@@ -120,6 +150,17 @@ impl PipelineElement {
     ) -> (Option<OutDest>, Option<OutDest>) {
         self.expr.expr.pipe_redirection(working_set)
     }
+
+    pub fn replace_in_variable(
+        &mut self,
+        working_set: &mut StateWorkingSet<'_>,
+        new_var_id: VarId,
+    ) {
+        self.expr.replace_in_variable(working_set, new_var_id);
+        if let Some(redirection) = &mut self.redirection {
+            redirection.replace_in_variable(working_set, new_var_id);
+        }
+    }
 }
 
 #[derive(Debug, Clone, Serialize, Deserialize)]
diff --git a/crates/nu-protocol/src/debugger/profiler.rs b/crates/nu-protocol/src/debugger/profiler.rs
index c61ee04d12..ea81a1b814 100644
--- a/crates/nu-protocol/src/debugger/profiler.rs
+++ b/crates/nu-protocol/src/debugger/profiler.rs
@@ -343,6 +343,7 @@ fn expr_to_string(engine_state: &EngineState, expr: &Expr) -> String {
         Expr::String(_) | Expr::RawString(_) => "string".to_string(),
         Expr::StringInterpolation(_) => "string interpolation".to_string(),
         Expr::GlobInterpolation(_, _) => "glob interpolation".to_string(),
+        Expr::Collect(_, _) => "collect".to_string(),
         Expr::Subexpression(_) => "subexpression".to_string(),
         Expr::Table(_) => "table".to_string(),
         Expr::UnaryNot(_) => "unary not".to_string(),
diff --git a/crates/nu-protocol/src/eval_base.rs b/crates/nu-protocol/src/eval_base.rs
index f49d235482..933342f577 100644
--- a/crates/nu-protocol/src/eval_base.rs
+++ b/crates/nu-protocol/src/eval_base.rs
@@ -159,6 +159,9 @@ pub trait Eval {
             Expr::ExternalCall(head, args) => {
                 Self::eval_external_call(state, mut_state, head, args, expr_span)
             }
+            Expr::Collect(var_id, expr) => {
+                Self::eval_collect::<D>(state, mut_state, *var_id, expr)
+            }
             Expr::Subexpression(block_id) => {
                 Self::eval_subexpression::<D>(state, mut_state, *block_id, expr_span)
             }
@@ -356,6 +359,13 @@ pub trait Eval {
         span: Span,
     ) -> Result<Value, ShellError>;
 
+    fn eval_collect<D: DebugContext>(
+        state: Self::State<'_>,
+        mut_state: &mut Self::MutState,
+        var_id: VarId,
+        expr: &Expression,
+    ) -> Result<Value, ShellError>;
+
     fn eval_subexpression<D: DebugContext>(
         state: Self::State<'_>,
         mut_state: &mut Self::MutState,
diff --git a/crates/nu-protocol/src/eval_const.rs b/crates/nu-protocol/src/eval_const.rs
index cba2392338..cd4780773f 100644
--- a/crates/nu-protocol/src/eval_const.rs
+++ b/crates/nu-protocol/src/eval_const.rs
@@ -422,6 +422,15 @@ impl Eval for EvalConst {
         Err(ShellError::NotAConstant { span })
     }
 
+    fn eval_collect<D: DebugContext>(
+        _: &StateWorkingSet,
+        _: &mut (),
+        _var_id: VarId,
+        expr: &Expression,
+    ) -> Result<Value, ShellError> {
+        Err(ShellError::NotAConstant { span: expr.span })
+    }
+
     fn eval_subexpression<D: DebugContext>(
         working_set: &StateWorkingSet,
         _: &mut (),
diff --git a/crates/nu-protocol/src/ir/display.rs b/crates/nu-protocol/src/ir/display.rs
index c28323cca4..0026e15eab 100644
--- a/crates/nu-protocol/src/ir/display.rs
+++ b/crates/nu-protocol/src/ir/display.rs
@@ -102,6 +102,10 @@ impl<'a> fmt::Display for FmtInstruction<'a> {
                 let var = FmtVar::new(self.engine_state, *var_id);
                 write!(f, "{:WIDTH$} {var}, {src}", "store-variable")
             }
+            Instruction::DropVariable { var_id } => {
+                let var = FmtVar::new(self.engine_state, *var_id);
+                write!(f, "{:WIDTH$} {var}", "drop-variable")
+            }
             Instruction::LoadEnv { dst, key } => {
                 let key = FmtData(self.data, *key);
                 write!(f, "{:WIDTH$} {dst}, {key}", "load-env")
diff --git a/crates/nu-protocol/src/ir/mod.rs b/crates/nu-protocol/src/ir/mod.rs
index aff5f42c8c..bdded28c0e 100644
--- a/crates/nu-protocol/src/ir/mod.rs
+++ b/crates/nu-protocol/src/ir/mod.rs
@@ -127,6 +127,8 @@ pub enum Instruction {
     LoadVariable { dst: RegId, var_id: VarId },
     /// Store the value of a variable from the `src` register
     StoreVariable { var_id: VarId, src: RegId },
+    /// Remove a variable from the stack, freeing up whatever resources were associated with it
+    DropVariable { var_id: VarId },
     /// Load the value of an environment variable into the `dst` register
     LoadEnv { dst: RegId, key: DataSlice },
     /// Load the value of an environment variable into the `dst` register, or `Nothing` if it
@@ -290,6 +292,7 @@ impl Instruction {
             Instruction::Drain { .. } => None,
             Instruction::LoadVariable { dst, .. } => Some(dst),
             Instruction::StoreVariable { .. } => None,
+            Instruction::DropVariable { .. } => None,
             Instruction::LoadEnv { dst, .. } => Some(dst),
             Instruction::LoadEnvOpt { dst, .. } => Some(dst),
             Instruction::StoreEnv { .. } => None,
diff --git a/crates/nuon/src/from.rs b/crates/nuon/src/from.rs
index 2cfeaebc61..2138e50ac5 100644
--- a/crates/nuon/src/from.rs
+++ b/crates/nuon/src/from.rs
@@ -329,6 +329,12 @@ fn convert_to_value(
             msg: "glob interpolation not supported in nuon".into(),
             span: expr.span,
         }),
+        Expr::Collect(..) => Err(ShellError::OutsideSpannedLabeledError {
+            src: original_text.to_string(),
+            error: "Error when loading".into(),
+            msg: "`$in` not supported in nuon".into(),
+            span: expr.span,
+        }),
         Expr::Subexpression(..) => Err(ShellError::OutsideSpannedLabeledError {
             src: original_text.to_string(),
             error: "Error when loading".into(),
diff --git a/tests/repl/test_engine.rs b/tests/repl/test_engine.rs
index e452295f42..1b74e741f7 100644
--- a/tests/repl/test_engine.rs
+++ b/tests/repl/test_engine.rs
@@ -52,6 +52,29 @@ fn in_and_if_else() -> TestResult {
     )
 }
 
+#[test]
+fn in_with_closure() -> TestResult {
+    // Can use $in twice
+    run_test(r#"3 | do { let x = $in; let y = $in; $x + $y }"#, "6")
+}
+
+#[test]
+fn in_with_custom_command() -> TestResult {
+    // Can use $in twice
+    run_test(
+        r#"def foo [] { let x = $in; let y = $in; $x + $y }; 3 | foo"#,
+        "6",
+    )
+}
+
+#[test]
+fn in_used_twice_and_also_in_pipeline() -> TestResult {
+    run_test(
+        r#"3 | do { let x = $in; let y = $in; $x + $y | $in * 4 }"#,
+        "24",
+    )
+}
+
 #[test]
 fn help_works_with_missing_requirements() -> TestResult {
     run_test(r#"each --help | lines | length"#, "72")
diff --git a/tests/repl/test_type_check.rs b/tests/repl/test_type_check.rs
index 87216e6672..8a00f2f486 100644
--- a/tests/repl/test_type_check.rs
+++ b/tests/repl/test_type_check.rs
@@ -129,3 +129,16 @@ fn transpose_into_load_env() -> TestResult {
         "10",
     )
 }
+
+#[test]
+fn in_variable_expression_correct_output_type() -> TestResult {
+    run_test(r#"def foo []: nothing -> string { 'foo' | $"($in)" }"#, "")
+}
+
+#[test]
+fn in_variable_expression_wrong_output_type() -> TestResult {
+    fail_test(
+        r#"def foo []: nothing -> int { 'foo' | $"($in)" }"#,
+        "expected int",
+    )
+}

From 22379c9846713c3905bc3e4ef5f9187113f5a689 Mon Sep 17 00:00:00 2001
From: 132ikl <132@ikl.sh>
Date: Wed, 17 Jul 2024 19:52:47 -0400
Subject: [PATCH 02/14] Make default config more consistent (#13399)

<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

Fixes some minor config inconsistencies:
- Add default value for `shape_glob_interpolation` to light theme, as it
is missing
- Add right padding space to `shape_garbage` options
- Use a integer value for `footer_mode` instead of a string
- Set `buffer_editor` to null instead of empty string

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->
N/A

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->
N/A
# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
N/A
---
 crates/nu-utils/src/sample_config/default_config.nu | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/crates/nu-utils/src/sample_config/default_config.nu b/crates/nu-utils/src/sample_config/default_config.nu
index 5856e385c4..3c3e57fa95 100644
--- a/crates/nu-utils/src/sample_config/default_config.nu
+++ b/crates/nu-utils/src/sample_config/default_config.nu
@@ -47,7 +47,7 @@ let dark_theme = {
     shape_flag: blue_bold
     shape_float: purple_bold
     # shapes are used to change the cli syntax highlighting
-    shape_garbage: { fg: white bg: red attr: b}
+    shape_garbage: { fg: white bg: red attr: b }
     shape_glob_interpolation: cyan_bold
     shape_globpattern: cyan_bold
     shape_int: purple_bold
@@ -114,7 +114,8 @@ let light_theme = {
     shape_flag: blue_bold
     shape_float: purple_bold
     # shapes are used to change the cli syntax highlighting
-    shape_garbage: { fg: white bg: red attr: b}
+    shape_garbage: { fg: white bg: red attr: b }
+    shape_glob_interpolation: cyan_bold
     shape_globpattern: cyan_bold
     shape_int: purple_bold
     shape_internalcall: cyan_bold
@@ -226,9 +227,9 @@ $env.config = {
 
     color_config: $dark_theme # if you want a more interesting theme, you can replace the empty record with `$dark_theme`, `$light_theme` or another custom record
     use_grid_icons: true
-    footer_mode: "25" # always, never, number_of_rows, auto
+    footer_mode: 25 # always, never, number_of_rows, auto
     float_precision: 2 # the precision for displaying floats in tables
-    buffer_editor: "" # command that will be used to edit the current line buffer with ctrl+o, if unset fallback to $env.EDITOR and $env.VISUAL
+    buffer_editor: null # command that will be used to edit the current line buffer with ctrl+o, if unset fallback to $env.EDITOR and $env.VISUAL
     use_ansi_coloring: true
     bracketed_paste: true # enable bracketed paste, currently useless on windows
     edit_mode: emacs # emacs, vi
@@ -888,4 +889,4 @@ $env.config = {
             event: { edit: selectall }
         }
     ]
-}
\ No newline at end of file
+}

From c19944f2916e023ef2325331ebedd4d36be5a46e Mon Sep 17 00:00:00 2001
From: Ian Manske <ian.manske@pm.me>
Date: Fri, 19 Jul 2024 04:16:09 +0000
Subject: [PATCH 03/14] Refactor `window` (#13401)

# Description
Following from #13377, this PR refactors the related `window` command.
In the case where `window` should behave exactly like `chunks`, it now
reuses the same code that `chunks` does. Otherwise, the other cases have
been rewritten and have resulted in a performance increase:

| window size | stride | old time (ms) | new time (ms) |
| -----------:| ------:| -------------:| -------------:|
| 20          | 1      | 757           | 722           |
| 2           | 1      | 364           | 333           |
| 1           | 1      | 343           | 293           |
| 20          | 20     | 90            | 63            |
| 2           | 2      | 215           | 175           |
| 20          | 30     | 74            | 60            |
| 2           | 4      | 141           | 110           |


# User-Facing Changes
`window` will now error if the window size or stride is 0, which is
technically a breaking change.
---
 crates/nu-command/src/filters/chunks.rs    |  48 ++--
 crates/nu-command/src/filters/window.rs    | 307 +++++++++++----------
 crates/nu-command/tests/commands/mod.rs    |   1 +
 crates/nu-command/tests/commands/window.rs | 103 +++++++
 4 files changed, 297 insertions(+), 162 deletions(-)
 create mode 100644 crates/nu-command/tests/commands/window.rs

diff --git a/crates/nu-command/src/filters/chunks.rs b/crates/nu-command/src/filters/chunks.rs
index 11a59c2aa9..68d682ecd1 100644
--- a/crates/nu-command/src/filters/chunks.rs
+++ b/crates/nu-command/src/filters/chunks.rs
@@ -1,5 +1,6 @@
 use nu_engine::command_prelude::*;
 use nu_protocol::ListStream;
+use std::num::NonZeroUsize;
 
 #[derive(Clone)]
 pub struct Chunks;
@@ -89,26 +90,33 @@ impl Command for Chunks {
                 span: chunk_size.span(),
             })?;
 
-        if size == 0 {
-            return Err(ShellError::IncorrectValue {
-                msg: "`chunk_size` cannot be zero".into(),
-                val_span: chunk_size.span(),
-                call_span: head,
-            });
-        }
+        let size = NonZeroUsize::try_from(size).map_err(|_| ShellError::IncorrectValue {
+            msg: "`chunk_size` cannot be zero".into(),
+            val_span: chunk_size.span(),
+            call_span: head,
+        })?;
 
-        match input {
-            PipelineData::Value(Value::List { vals, .. }, metadata) => {
-                let chunks = ChunksIter::new(vals, size, head);
-                let stream = ListStream::new(chunks, head, engine_state.signals().clone());
-                Ok(PipelineData::ListStream(stream, metadata))
-            }
-            PipelineData::ListStream(stream, metadata) => {
-                let stream = stream.modify(|iter| ChunksIter::new(iter, size, head));
-                Ok(PipelineData::ListStream(stream, metadata))
-            }
-            input => Err(input.unsupported_input_error("list", head)),
+        chunks(engine_state, input, size, head)
+    }
+}
+
+pub fn chunks(
+    engine_state: &EngineState,
+    input: PipelineData,
+    chunk_size: NonZeroUsize,
+    span: Span,
+) -> Result<PipelineData, ShellError> {
+    match input {
+        PipelineData::Value(Value::List { vals, .. }, metadata) => {
+            let chunks = ChunksIter::new(vals, chunk_size, span);
+            let stream = ListStream::new(chunks, span, engine_state.signals().clone());
+            Ok(PipelineData::ListStream(stream, metadata))
         }
+        PipelineData::ListStream(stream, metadata) => {
+            let stream = stream.modify(|iter| ChunksIter::new(iter, chunk_size, span));
+            Ok(PipelineData::ListStream(stream, metadata))
+        }
+        input => Err(input.unsupported_input_error("list", span)),
     }
 }
 
@@ -119,10 +127,10 @@ struct ChunksIter<I: Iterator<Item = Value>> {
 }
 
 impl<I: Iterator<Item = Value>> ChunksIter<I> {
-    fn new(iter: impl IntoIterator<IntoIter = I>, size: usize, span: Span) -> Self {
+    fn new(iter: impl IntoIterator<IntoIter = I>, size: NonZeroUsize, span: Span) -> Self {
         Self {
             iter: iter.into_iter(),
-            size,
+            size: size.into(),
             span,
         }
     }
diff --git a/crates/nu-command/src/filters/window.rs b/crates/nu-command/src/filters/window.rs
index 3c470f478d..556543118b 100644
--- a/crates/nu-command/src/filters/window.rs
+++ b/crates/nu-command/src/filters/window.rs
@@ -1,5 +1,6 @@
 use nu_engine::command_prelude::*;
-use nu_protocol::ValueIterator;
+use nu_protocol::ListStream;
+use std::num::NonZeroUsize;
 
 #[derive(Clone)]
 pub struct Window;
@@ -12,8 +13,8 @@ impl Command for Window {
     fn signature(&self) -> Signature {
         Signature::build("window")
             .input_output_types(vec![(
-                Type::List(Box::new(Type::Any)),
-                Type::List(Box::new(Type::List(Box::new(Type::Any)))),
+                Type::list(Type::Any),
+                Type::list(Type::list(Type::Any)),
             )])
             .required("window_size", SyntaxShape::Int, "The size of each window.")
             .named(
@@ -34,72 +35,41 @@ impl Command for Window {
         "Creates a sliding window of `window_size` that slide by n rows/elements across input."
     }
 
+    fn extra_usage(&self) -> &str {
+        "This command will error if `window_size` or `stride` are negative or zero."
+    }
+
     fn examples(&self) -> Vec<Example> {
-        let stream_test_1 = vec![
-            Value::list(
-                vec![Value::test_int(1), Value::test_int(2)],
-                Span::test_data(),
-            ),
-            Value::list(
-                vec![Value::test_int(2), Value::test_int(3)],
-                Span::test_data(),
-            ),
-            Value::list(
-                vec![Value::test_int(3), Value::test_int(4)],
-                Span::test_data(),
-            ),
-        ];
-
-        let stream_test_2 = vec![
-            Value::list(
-                vec![Value::test_int(1), Value::test_int(2)],
-                Span::test_data(),
-            ),
-            Value::list(
-                vec![Value::test_int(4), Value::test_int(5)],
-                Span::test_data(),
-            ),
-            Value::list(
-                vec![Value::test_int(7), Value::test_int(8)],
-                Span::test_data(),
-            ),
-        ];
-
-        let stream_test_3 = vec![
-            Value::list(
-                vec![Value::test_int(1), Value::test_int(2), Value::test_int(3)],
-                Span::test_data(),
-            ),
-            Value::list(
-                vec![Value::test_int(4), Value::test_int(5)],
-                Span::test_data(),
-            ),
-        ];
-
         vec![
             Example {
                 example: "[1 2 3 4] | window 2",
                 description: "A sliding window of two elements",
-                result: Some(Value::list(
-                    stream_test_1,
-                    Span::test_data(),
-                )),
+                result: Some(Value::test_list(vec![
+                    Value::test_list(vec![Value::test_int(1), Value::test_int(2)]),
+                    Value::test_list(vec![Value::test_int(2), Value::test_int(3)]),
+                    Value::test_list(vec![Value::test_int(3), Value::test_int(4)]),
+                ])),
             },
             Example {
                 example: "[1, 2, 3, 4, 5, 6, 7, 8] | window 2 --stride 3",
                 description: "A sliding window of two elements, with a stride of 3",
-                result: Some(Value::list(
-                    stream_test_2,
-                    Span::test_data(),
-                )),
+                result: Some(Value::test_list(vec![
+                    Value::test_list(vec![Value::test_int(1), Value::test_int(2)]),
+                    Value::test_list(vec![Value::test_int(4), Value::test_int(5)]),
+                    Value::test_list(vec![Value::test_int(7), Value::test_int(8)]),
+                ])),
             },
             Example {
                 example: "[1, 2, 3, 4, 5] | window 3 --stride 3 --remainder",
                 description: "A sliding window of equal stride that includes remainder. Equivalent to chunking",
-                result: Some(Value::list(
-                    stream_test_3,
-                    Span::test_data(),
-                )),
+                result: Some(Value::test_list(vec![
+                    Value::test_list(vec![
+                        Value::test_int(1),
+                        Value::test_int(2),
+                        Value::test_int(3),
+                    ]),
+                    Value::test_list(vec![Value::test_int(4), Value::test_int(5)]),
+                ])),
             },
         ]
     }
@@ -112,116 +82,169 @@ impl Command for Window {
         input: PipelineData,
     ) -> Result<PipelineData, ShellError> {
         let head = call.head;
-        let group_size: Spanned<usize> = call.req(engine_state, stack, 0)?;
-        let metadata = input.metadata();
-        let stride: Option<usize> = call.get_flag(engine_state, stack, "stride")?;
+        let window_size: Value = call.req(engine_state, stack, 0)?;
+        let stride: Option<Value> = call.get_flag(engine_state, stack, "stride")?;
         let remainder = call.has_flag(engine_state, stack, "remainder")?;
 
-        let stride = stride.unwrap_or(1);
+        let size =
+            usize::try_from(window_size.as_int()?).map_err(|_| ShellError::NeedsPositiveValue {
+                span: window_size.span(),
+            })?;
 
-        //FIXME: add in support for external redirection when engine-q supports it generally
+        let size = NonZeroUsize::try_from(size).map_err(|_| ShellError::IncorrectValue {
+            msg: "`window_size` cannot be zero".into(),
+            val_span: window_size.span(),
+            call_span: head,
+        })?;
 
-        let each_group_iterator = EachWindowIterator {
-            group_size: group_size.item,
-            input: Box::new(input.into_iter()),
-            span: head,
-            previous: None,
-            stride,
-            remainder,
+        let stride = if let Some(stride_val) = stride {
+            let stride = usize::try_from(stride_val.as_int()?).map_err(|_| {
+                ShellError::NeedsPositiveValue {
+                    span: stride_val.span(),
+                }
+            })?;
+
+            NonZeroUsize::try_from(stride).map_err(|_| ShellError::IncorrectValue {
+                msg: "`stride` cannot be zero".into(),
+                val_span: stride_val.span(),
+                call_span: head,
+            })?
+        } else {
+            NonZeroUsize::MIN
         };
 
-        Ok(each_group_iterator.into_pipeline_data_with_metadata(
-            head,
-            engine_state.signals().clone(),
-            metadata,
-        ))
+        if remainder && size == stride {
+            super::chunks::chunks(engine_state, input, size, head)
+        } else if stride >= size {
+            match input {
+                PipelineData::Value(Value::List { vals, .. }, metadata) => {
+                    let chunks = WindowGapIter::new(vals, size, stride, remainder, head);
+                    let stream = ListStream::new(chunks, head, engine_state.signals().clone());
+                    Ok(PipelineData::ListStream(stream, metadata))
+                }
+                PipelineData::ListStream(stream, metadata) => {
+                    let stream = stream
+                        .modify(|iter| WindowGapIter::new(iter, size, stride, remainder, head));
+                    Ok(PipelineData::ListStream(stream, metadata))
+                }
+                input => Err(input.unsupported_input_error("list", head)),
+            }
+        } else {
+            match input {
+                PipelineData::Value(Value::List { vals, .. }, metadata) => {
+                    let chunks = WindowOverlapIter::new(vals, size, stride, remainder, head);
+                    let stream = ListStream::new(chunks, head, engine_state.signals().clone());
+                    Ok(PipelineData::ListStream(stream, metadata))
+                }
+                PipelineData::ListStream(stream, metadata) => {
+                    let stream = stream
+                        .modify(|iter| WindowOverlapIter::new(iter, size, stride, remainder, head));
+                    Ok(PipelineData::ListStream(stream, metadata))
+                }
+                input => Err(input.unsupported_input_error("list", head)),
+            }
+        }
     }
 }
 
-struct EachWindowIterator {
-    group_size: usize,
-    input: ValueIterator,
-    span: Span,
-    previous: Option<Vec<Value>>,
+struct WindowOverlapIter<I: Iterator<Item = Value>> {
+    iter: I,
+    window: Vec<Value>,
     stride: usize,
     remainder: bool,
+    span: Span,
 }
 
-impl Iterator for EachWindowIterator {
+impl<I: Iterator<Item = Value>> WindowOverlapIter<I> {
+    fn new(
+        iter: impl IntoIterator<IntoIter = I>,
+        size: NonZeroUsize,
+        stride: NonZeroUsize,
+        remainder: bool,
+        span: Span,
+    ) -> Self {
+        Self {
+            iter: iter.into_iter(),
+            window: Vec::with_capacity(size.into()),
+            stride: stride.into(),
+            remainder,
+            span,
+        }
+    }
+}
+
+impl<I: Iterator<Item = Value>> Iterator for WindowOverlapIter<I> {
     type Item = Value;
 
     fn next(&mut self) -> Option<Self::Item> {
-        let mut group = self.previous.take().unwrap_or_else(|| {
-            let mut vec = Vec::new();
-
-            // We default to a Vec of capacity size + stride as striding pushes n extra elements to the end
-            vec.try_reserve(self.group_size + self.stride).ok();
-
-            vec
-        });
-        let mut current_count = 0;
-
-        if group.is_empty() {
-            loop {
-                let item = self.input.next();
-
-                match item {
-                    Some(v) => {
-                        group.push(v);
-
-                        current_count += 1;
-                        if current_count >= self.group_size {
-                            break;
-                        }
-                    }
-                    None => {
-                        if self.remainder {
-                            break;
-                        } else {
-                            return None;
-                        }
-                    }
-                }
-            }
+        let len = if self.window.is_empty() {
+            self.window.capacity()
         } else {
-            // our historic buffer is already full, so stride instead
+            self.stride
+        };
 
-            loop {
-                let item = self.input.next();
+        self.window.extend((&mut self.iter).take(len));
 
-                match item {
-                    Some(v) => {
-                        group.push(v);
-
-                        current_count += 1;
-                        if current_count >= self.stride {
-                            break;
-                        }
-                    }
-                    None => {
-                        if self.remainder {
-                            break;
-                        } else {
-                            return None;
-                        }
-                    }
-                }
-            }
-
-            // We now have elements + stride in our group, and need to
-            // drop the skipped elements. Drain to preserve allocation and capacity
-            // Dropping this iterator consumes it.
-            group.drain(..self.stride.min(group.len()));
+        if self.window.len() == self.window.capacity()
+            || (self.remainder && !self.window.is_empty())
+        {
+            let mut next = Vec::with_capacity(self.window.len());
+            next.extend(self.window.iter().skip(self.stride).cloned());
+            let window = std::mem::replace(&mut self.window, next);
+            Some(Value::list(window, self.span))
+        } else {
+            None
         }
+    }
+}
 
-        if group.is_empty() {
-            return None;
+struct WindowGapIter<I: Iterator<Item = Value>> {
+    iter: I,
+    size: usize,
+    skip: usize,
+    first: bool,
+    remainder: bool,
+    span: Span,
+}
+
+impl<I: Iterator<Item = Value>> WindowGapIter<I> {
+    fn new(
+        iter: impl IntoIterator<IntoIter = I>,
+        size: NonZeroUsize,
+        stride: NonZeroUsize,
+        remainder: bool,
+        span: Span,
+    ) -> Self {
+        let size = size.into();
+        Self {
+            iter: iter.into_iter(),
+            size,
+            skip: stride.get() - size,
+            first: true,
+            remainder,
+            span,
         }
+    }
+}
 
-        let return_group = group.clone();
-        self.previous = Some(group);
+impl<I: Iterator<Item = Value>> Iterator for WindowGapIter<I> {
+    type Item = Value;
 
-        Some(Value::list(return_group, self.span))
+    fn next(&mut self) -> Option<Self::Item> {
+        let mut window = Vec::with_capacity(self.size);
+        window.extend(
+            (&mut self.iter)
+                .skip(if self.first { 0 } else { self.skip })
+                .take(self.size),
+        );
+
+        self.first = false;
+
+        if window.len() == self.size || (self.remainder && !window.is_empty()) {
+            Some(Value::list(window, self.span))
+        } else {
+            None
+        }
     }
 }
 
diff --git a/crates/nu-command/tests/commands/mod.rs b/crates/nu-command/tests/commands/mod.rs
index 4c9f8e86f8..3cb468ad30 100644
--- a/crates/nu-command/tests/commands/mod.rs
+++ b/crates/nu-command/tests/commands/mod.rs
@@ -115,6 +115,7 @@ mod try_;
 mod ucp;
 #[cfg(unix)]
 mod ulimit;
+mod window;
 
 mod debug;
 mod umkdir;
diff --git a/crates/nu-command/tests/commands/window.rs b/crates/nu-command/tests/commands/window.rs
new file mode 100644
index 0000000000..6e4addc6c0
--- /dev/null
+++ b/crates/nu-command/tests/commands/window.rs
@@ -0,0 +1,103 @@
+use nu_test_support::nu;
+
+#[test]
+fn window_size_negative() {
+    let actual = nu!("[0 1 2] | window -1");
+    assert!(actual.err.contains("positive"));
+}
+
+#[test]
+fn window_size_zero() {
+    let actual = nu!("[0 1 2] | window 0");
+    assert!(actual.err.contains("zero"));
+}
+
+#[test]
+fn window_size_not_int() {
+    let actual = nu!("[0 1 2] | window (if true { 1sec })");
+    assert!(actual.err.contains("can't convert"));
+}
+
+#[test]
+fn stride_negative() {
+    let actual = nu!("[0 1 2] | window 1 -s -1");
+    assert!(actual.err.contains("positive"));
+}
+
+#[test]
+fn stride_zero() {
+    let actual = nu!("[0 1 2] | window 1 -s 0");
+    assert!(actual.err.contains("zero"));
+}
+
+#[test]
+fn stride_not_int() {
+    let actual = nu!("[0 1 2] | window 1 -s (if true { 1sec })");
+    assert!(actual.err.contains("can't convert"));
+}
+
+#[test]
+fn empty() {
+    let actual = nu!("[] | window 2 | is-empty");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn list_stream() {
+    let actual = nu!("([0 1 2] | every 1 | window 2) == ([0 1 2] | window 2)");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn table_stream() {
+    let actual = nu!("([[foo bar]; [0 1] [2 3] [4 5]] | every 1 | window 2) == ([[foo bar]; [0 1] [2 3] [4 5]] | window 2)");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn no_empty_chunks() {
+    let actual = nu!("([0 1 2 3 4 5] | window 3 -s 3 -r | length) == 2");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn same_as_chunks() {
+    let actual = nu!("([0 1 2 3 4] | window 2 -s 2 -r) == ([0 1 2 3 4 ] | chunks 2)");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn stride_equal_to_window_size() {
+    let actual = nu!("([0 1 2 3] | window 2 -s 2 | flatten) == [0 1 2 3]");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn stride_greater_than_window_size() {
+    let actual = nu!("([0 1 2 3 4] | window 2 -s 3 | flatten) == [0 1 3 4]");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn stride_less_than_window_size() {
+    let actual = nu!("([0 1 2 3 4 5] | window 3 -s 2 | length) == 2");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn stride_equal_to_window_size_remainder() {
+    let actual = nu!("([0 1 2 3 4] | window 2 -s 2 -r | flatten) == [0 1 2 3 4]");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn stride_greater_than_window_size_remainder() {
+    let actual = nu!("([0 1 2 3 4] | window 2 -s 3 -r | flatten) == [0 1 3 4]");
+    assert_eq!(actual.out, "true");
+}
+
+#[test]
+fn stride_less_than_window_size_remainder() {
+    let actual = nu!("([0 1 2 3 4 5] | window 3 -s 2 -r | length) == 3");
+    assert_eq!(actual.out, "true");
+}

From f3843a61762ab4054608b2b0393a322ee01dc78d Mon Sep 17 00:00:00 2001
From: Devyn Cairns <devyn.cairns@gmail.com>
Date: Thu, 18 Jul 2024 22:54:21 -0700
Subject: [PATCH 04/14] Make plugins able to find and call other commands
 (#13407)

# Description

Adds functionality to the plugin interface to support calling internal
commands from plugins. For example, using `view ir --json`:

```rust
let closure: Value = call.req(0)?;

let Some(decl_id) = engine.find_decl("view ir")? else {
    return Err(LabeledError::new("`view ir` not found"));
};

let ir_json = engine.call_decl(
    decl_id,
    EvaluatedCall::new(call.head)
        .with_named("json".into_spanned(call.head), Value::bool(true, call.head))
        .with_positional(closure),
    PipelineData::Empty,
    true,
    false,
)?.into_value()?.into_string()?;

let ir = serde_json::from_value(&ir_json);

// ...
```

# User-Facing Changes

Plugin developers can now use `EngineInterface::find_decl()` and
`call_decl()` to call internal commands, which could be handy for
formatters like `to csv` or `to nuon`, or for reflection commands that
help gain insight into the engine.

# Tests + Formatting
- :green_circle: `toolkit fmt`
- :green_circle: `toolkit clippy`
- :green_circle: `toolkit test`
- :green_circle: `toolkit test stdlib`

# After Submitting
- [ ] release notes
- [ ] update plugin protocol documentation: `FindDecl`, `CallDecl`
engine calls; `Identifier` engine call response
---
 crates/nu-plugin-engine/src/context.rs        | 103 +++++++++++++++---
 crates/nu-plugin-engine/src/interface/mod.rs  |  16 +++
 .../nu-plugin-protocol/src/evaluated_call.rs  |  86 +++++++++++++++
 crates/nu-plugin-protocol/src/lib.rs          |  35 +++++-
 crates/nu-plugin/src/plugin/interface/mod.rs  |  73 ++++++++++++-
 crates/nu-protocol/src/ir/call.rs             |   2 +-
 .../src/commands/call_decl.rs                 |  78 +++++++++++++
 crates/nu_plugin_example/src/commands/mod.rs  |   2 +
 crates/nu_plugin_example/src/lib.rs           |   1 +
 tests/plugins/call_decl.rs                    |  42 +++++++
 tests/plugins/mod.rs                          |   1 +
 11 files changed, 418 insertions(+), 21 deletions(-)
 create mode 100644 crates/nu_plugin_example/src/commands/call_decl.rs
 create mode 100644 tests/plugins/call_decl.rs

diff --git a/crates/nu-plugin-engine/src/context.rs b/crates/nu-plugin-engine/src/context.rs
index aa504a246a..d9023cde35 100644
--- a/crates/nu-plugin-engine/src/context.rs
+++ b/crates/nu-plugin-engine/src/context.rs
@@ -1,9 +1,10 @@
 use crate::util::MutableCow;
 use nu_engine::{get_eval_block_with_early_return, get_full_help, ClosureEvalOnce};
+use nu_plugin_protocol::EvaluatedCall;
 use nu_protocol::{
     engine::{Call, Closure, EngineState, Redirection, Stack},
-    Config, IntoSpanned, OutDest, PipelineData, PluginIdentity, ShellError, Signals, Span, Spanned,
-    Value,
+    ir, Config, DeclId, IntoSpanned, OutDest, PipelineData, PluginIdentity, ShellError, Signals,
+    Span, Spanned, Value,
 };
 use std::{
     borrow::Cow,
@@ -44,6 +45,17 @@ pub trait PluginExecutionContext: Send + Sync {
         redirect_stdout: bool,
         redirect_stderr: bool,
     ) -> Result<PipelineData, ShellError>;
+    /// Find a declaration by name
+    fn find_decl(&self, name: &str) -> Result<Option<DeclId>, ShellError>;
+    /// Call a declaration with arguments and input
+    fn call_decl(
+        &mut self,
+        decl_id: DeclId,
+        call: EvaluatedCall,
+        input: PipelineData,
+        redirect_stdout: bool,
+        redirect_stderr: bool,
+    ) -> Result<PipelineData, ShellError>;
     /// Create an owned version of the context with `'static` lifetime
     fn boxed(&self) -> Box<dyn PluginExecutionContext>;
 }
@@ -177,19 +189,10 @@ impl<'a> PluginExecutionContext for PluginExecutionCommandContext<'a> {
             .captures_to_stack(closure.item.captures)
             .reset_pipes();
 
-        let stdout = if redirect_stdout {
-            Some(Redirection::Pipe(OutDest::Capture))
-        } else {
-            None
-        };
-
-        let stderr = if redirect_stderr {
-            Some(Redirection::Pipe(OutDest::Capture))
-        } else {
-            None
-        };
-
-        let stack = &mut stack.push_redirection(stdout, stderr);
+        let stack = &mut stack.push_redirection(
+            redirect_stdout.then_some(Redirection::Pipe(OutDest::Capture)),
+            redirect_stderr.then_some(Redirection::Pipe(OutDest::Capture)),
+        );
 
         // Set up the positional arguments
         for (idx, value) in positional.into_iter().enumerate() {
@@ -211,6 +214,57 @@ impl<'a> PluginExecutionContext for PluginExecutionCommandContext<'a> {
         eval_block_with_early_return(&self.engine_state, stack, block, input)
     }
 
+    fn find_decl(&self, name: &str) -> Result<Option<DeclId>, ShellError> {
+        Ok(self.engine_state.find_decl(name.as_bytes(), &[]))
+    }
+
+    fn call_decl(
+        &mut self,
+        decl_id: DeclId,
+        call: EvaluatedCall,
+        input: PipelineData,
+        redirect_stdout: bool,
+        redirect_stderr: bool,
+    ) -> Result<PipelineData, ShellError> {
+        if decl_id >= self.engine_state.num_decls() {
+            return Err(ShellError::GenericError {
+                error: "Plugin misbehaving".into(),
+                msg: format!("Tried to call unknown decl id: {}", decl_id),
+                span: Some(call.head),
+                help: None,
+                inner: vec![],
+            });
+        }
+
+        let decl = self.engine_state.get_decl(decl_id);
+
+        let stack = &mut self.stack.push_redirection(
+            redirect_stdout.then_some(Redirection::Pipe(OutDest::Capture)),
+            redirect_stderr.then_some(Redirection::Pipe(OutDest::Capture)),
+        );
+
+        let mut call_builder = ir::Call::build(decl_id, call.head);
+
+        for positional in call.positional {
+            call_builder.add_positional(stack, positional.span(), positional);
+        }
+
+        for (name, value) in call.named {
+            if let Some(value) = value {
+                call_builder.add_named(stack, &name.item, "", name.span, value);
+            } else {
+                call_builder.add_flag(stack, &name.item, "", name.span);
+            }
+        }
+
+        decl.run(
+            &self.engine_state,
+            stack,
+            &(&call_builder.finish()).into(),
+            input,
+        )
+    }
+
     fn boxed(&self) -> Box<dyn PluginExecutionContext + 'static> {
         Box::new(PluginExecutionCommandContext {
             identity: self.identity.clone(),
@@ -298,6 +352,25 @@ impl PluginExecutionContext for PluginExecutionBogusContext {
         })
     }
 
+    fn find_decl(&self, _name: &str) -> Result<Option<DeclId>, ShellError> {
+        Err(ShellError::NushellFailed {
+            msg: "find_decl not implemented on bogus".into(),
+        })
+    }
+
+    fn call_decl(
+        &mut self,
+        _decl_id: DeclId,
+        _call: EvaluatedCall,
+        _input: PipelineData,
+        _redirect_stdout: bool,
+        _redirect_stderr: bool,
+    ) -> Result<PipelineData, ShellError> {
+        Err(ShellError::NushellFailed {
+            msg: "call_decl not implemented on bogus".into(),
+        })
+    }
+
     fn boxed(&self) -> Box<dyn PluginExecutionContext + 'static> {
         Box::new(PluginExecutionBogusContext)
     }
diff --git a/crates/nu-plugin-engine/src/interface/mod.rs b/crates/nu-plugin-engine/src/interface/mod.rs
index d1b722dedf..680b3b123f 100644
--- a/crates/nu-plugin-engine/src/interface/mod.rs
+++ b/crates/nu-plugin-engine/src/interface/mod.rs
@@ -1316,6 +1316,22 @@ pub(crate) fn handle_engine_call(
         } => context
             .eval_closure(closure, positional, input, redirect_stdout, redirect_stderr)
             .map(EngineCallResponse::PipelineData),
+        EngineCall::FindDecl(name) => context.find_decl(&name).map(|decl_id| {
+            if let Some(decl_id) = decl_id {
+                EngineCallResponse::Identifier(decl_id)
+            } else {
+                EngineCallResponse::empty()
+            }
+        }),
+        EngineCall::CallDecl {
+            decl_id,
+            call,
+            input,
+            redirect_stdout,
+            redirect_stderr,
+        } => context
+            .call_decl(decl_id, call, input, redirect_stdout, redirect_stderr)
+            .map(EngineCallResponse::PipelineData),
     }
 }
 
diff --git a/crates/nu-plugin-protocol/src/evaluated_call.rs b/crates/nu-plugin-protocol/src/evaluated_call.rs
index 58f8987865..5e760693a5 100644
--- a/crates/nu-plugin-protocol/src/evaluated_call.rs
+++ b/crates/nu-plugin-protocol/src/evaluated_call.rs
@@ -27,6 +27,82 @@ pub struct EvaluatedCall {
 }
 
 impl EvaluatedCall {
+    /// Create a new [`EvaluatedCall`] with the given head span.
+    pub fn new(head: Span) -> EvaluatedCall {
+        EvaluatedCall {
+            head,
+            positional: vec![],
+            named: vec![],
+        }
+    }
+
+    /// Add a positional argument to an [`EvaluatedCall`].
+    ///
+    /// # Example
+    ///
+    /// ```rust
+    /// # use nu_protocol::{Value, Span, IntoSpanned};
+    /// # use nu_plugin_protocol::EvaluatedCall;
+    /// # let head = Span::test_data();
+    /// let mut call = EvaluatedCall::new(head);
+    /// call.add_positional(Value::test_int(1337));
+    /// ```
+    pub fn add_positional(&mut self, value: Value) -> &mut Self {
+        self.positional.push(value);
+        self
+    }
+
+    /// Add a named argument to an [`EvaluatedCall`].
+    ///
+    /// # Example
+    ///
+    /// ```rust
+    /// # use nu_protocol::{Value, Span, IntoSpanned};
+    /// # use nu_plugin_protocol::EvaluatedCall;
+    /// # let head = Span::test_data();
+    /// let mut call = EvaluatedCall::new(head);
+    /// call.add_named("foo".into_spanned(head), Value::test_string("bar"));
+    /// ```
+    pub fn add_named(&mut self, name: Spanned<impl Into<String>>, value: Value) -> &mut Self {
+        self.named.push((name.map(Into::into), Some(value)));
+        self
+    }
+
+    /// Add a flag argument to an [`EvaluatedCall`]. A flag argument is a named argument with no
+    /// value.
+    ///
+    /// # Example
+    ///
+    /// ```rust
+    /// # use nu_protocol::{Value, Span, IntoSpanned};
+    /// # use nu_plugin_protocol::EvaluatedCall;
+    /// # let head = Span::test_data();
+    /// let mut call = EvaluatedCall::new(head);
+    /// call.add_flag("pretty".into_spanned(head));
+    /// ```
+    pub fn add_flag(&mut self, name: Spanned<impl Into<String>>) -> &mut Self {
+        self.named.push((name.map(Into::into), None));
+        self
+    }
+
+    /// Builder variant of [`.add_positional()`].
+    pub fn with_positional(mut self, value: Value) -> Self {
+        self.add_positional(value);
+        self
+    }
+
+    /// Builder variant of [`.add_named()`].
+    pub fn with_named(mut self, name: Spanned<impl Into<String>>, value: Value) -> Self {
+        self.add_named(name, value);
+        self
+    }
+
+    /// Builder variant of [`.add_flag()`].
+    pub fn with_flag(mut self, name: Spanned<impl Into<String>>) -> Self {
+        self.add_flag(name);
+        self
+    }
+
     /// Try to create an [`EvaluatedCall`] from a command `Call`.
     pub fn try_from_call(
         call: &Call,
@@ -192,6 +268,16 @@ impl EvaluatedCall {
         Ok(false)
     }
 
+    /// Returns the [`Span`] of the name of an optional named argument.
+    ///
+    /// This can be used in errors for named arguments that don't take values.
+    pub fn get_flag_span(&self, flag_name: &str) -> Option<Span> {
+        self.named
+            .iter()
+            .find(|(name, _)| name.item == flag_name)
+            .map(|(name, _)| name.span)
+    }
+
     /// Returns the [`Value`] of an optional named argument
     ///
     /// # Examples
diff --git a/crates/nu-plugin-protocol/src/lib.rs b/crates/nu-plugin-protocol/src/lib.rs
index 739366d910..a9196a2d8d 100644
--- a/crates/nu-plugin-protocol/src/lib.rs
+++ b/crates/nu-plugin-protocol/src/lib.rs
@@ -22,7 +22,7 @@ mod tests;
 pub mod test_util;
 
 use nu_protocol::{
-    ast::Operator, engine::Closure, ByteStreamType, Config, LabeledError, PipelineData,
+    ast::Operator, engine::Closure, ByteStreamType, Config, DeclId, LabeledError, PipelineData,
     PluginMetadata, PluginSignature, ShellError, Span, Spanned, Value,
 };
 use nu_utils::SharedCow;
@@ -494,6 +494,21 @@ pub enum EngineCall<D> {
         /// Whether to redirect stderr from external commands
         redirect_stderr: bool,
     },
+    /// Find a declaration by name
+    FindDecl(String),
+    /// Call a declaration with args
+    CallDecl {
+        /// The id of the declaration to be called (can be found with `FindDecl`)
+        decl_id: DeclId,
+        /// Information about the call (head span, arguments, etc.)
+        call: EvaluatedCall,
+        /// Pipeline input to the call
+        input: D,
+        /// Whether to redirect stdout from external commands
+        redirect_stdout: bool,
+        /// Whether to redirect stderr from external commands
+        redirect_stderr: bool,
+    },
 }
 
 impl<D> EngineCall<D> {
@@ -511,6 +526,8 @@ impl<D> EngineCall<D> {
             EngineCall::LeaveForeground => "LeaveForeground",
             EngineCall::GetSpanContents(_) => "GetSpanContents",
             EngineCall::EvalClosure { .. } => "EvalClosure",
+            EngineCall::FindDecl(_) => "FindDecl",
+            EngineCall::CallDecl { .. } => "CallDecl",
         }
     }
 
@@ -544,6 +561,20 @@ impl<D> EngineCall<D> {
                 redirect_stdout,
                 redirect_stderr,
             },
+            EngineCall::FindDecl(name) => EngineCall::FindDecl(name),
+            EngineCall::CallDecl {
+                decl_id,
+                call,
+                input,
+                redirect_stdout,
+                redirect_stderr,
+            } => EngineCall::CallDecl {
+                decl_id,
+                call,
+                input: f(input)?,
+                redirect_stdout,
+                redirect_stderr,
+            },
         })
     }
 }
@@ -556,6 +587,7 @@ pub enum EngineCallResponse<D> {
     PipelineData(D),
     Config(SharedCow<Config>),
     ValueMap(HashMap<String, Value>),
+    Identifier(usize),
 }
 
 impl<D> EngineCallResponse<D> {
@@ -570,6 +602,7 @@ impl<D> EngineCallResponse<D> {
             EngineCallResponse::PipelineData(data) => EngineCallResponse::PipelineData(f(data)?),
             EngineCallResponse::Config(config) => EngineCallResponse::Config(config),
             EngineCallResponse::ValueMap(map) => EngineCallResponse::ValueMap(map),
+            EngineCallResponse::Identifier(id) => EngineCallResponse::Identifier(id),
         })
     }
 }
diff --git a/crates/nu-plugin/src/plugin/interface/mod.rs b/crates/nu-plugin/src/plugin/interface/mod.rs
index 60c5964f26..c20e2adc39 100644
--- a/crates/nu-plugin/src/plugin/interface/mod.rs
+++ b/crates/nu-plugin/src/plugin/interface/mod.rs
@@ -6,12 +6,12 @@ use nu_plugin_core::{
     StreamManagerHandle,
 };
 use nu_plugin_protocol::{
-    CallInfo, CustomValueOp, EngineCall, EngineCallId, EngineCallResponse, Ordering, PluginCall,
-    PluginCallId, PluginCallResponse, PluginCustomValue, PluginInput, PluginOption, PluginOutput,
-    ProtocolInfo,
+    CallInfo, CustomValueOp, EngineCall, EngineCallId, EngineCallResponse, EvaluatedCall, Ordering,
+    PluginCall, PluginCallId, PluginCallResponse, PluginCustomValue, PluginInput, PluginOption,
+    PluginOutput, ProtocolInfo,
 };
 use nu_protocol::{
-    engine::Closure, Config, LabeledError, PipelineData, PluginMetadata, PluginSignature,
+    engine::Closure, Config, DeclId, LabeledError, PipelineData, PluginMetadata, PluginSignature,
     ShellError, Signals, Span, Spanned, Value,
 };
 use nu_utils::SharedCow;
@@ -872,6 +872,71 @@ impl EngineInterface {
         }
     }
 
+    /// Ask the engine for the identifier for a declaration. If found, the result can then be passed
+    /// to [`.call_decl()`] to call other internal commands.
+    ///
+    /// See [`.call_decl()`] for an example.
+    pub fn find_decl(&self, name: impl Into<String>) -> Result<Option<DeclId>, ShellError> {
+        let call = EngineCall::FindDecl(name.into());
+
+        match self.engine_call(call)? {
+            EngineCallResponse::Error(err) => Err(err),
+            EngineCallResponse::Identifier(id) => Ok(Some(id)),
+            EngineCallResponse::PipelineData(PipelineData::Empty) => Ok(None),
+            _ => Err(ShellError::PluginFailedToDecode {
+                msg: "Received unexpected response type for EngineCall::FindDecl".into(),
+            }),
+        }
+    }
+
+    /// Ask the engine to call an internal command, using the declaration ID previously looked up
+    /// with [`.find_decl()`].
+    ///
+    /// # Example
+    ///
+    /// ```rust,no_run
+    /// # use nu_protocol::{Value, ShellError, PipelineData};
+    /// # use nu_plugin::{EngineInterface, EvaluatedCall};
+    /// # fn example(engine: &EngineInterface, call: &EvaluatedCall) -> Result<Value, ShellError> {
+    /// if let Some(decl_id) = engine.find_decl("scope commands")? {
+    ///     let commands = engine.call_decl(
+    ///         decl_id,
+    ///         EvaluatedCall::new(call.head),
+    ///         PipelineData::Empty,
+    ///         true,
+    ///         false,
+    ///     )?;
+    ///     commands.into_value(call.head)
+    /// } else {
+    ///     Ok(Value::list(vec![], call.head))
+    /// }
+    /// # }
+    /// ```
+    pub fn call_decl(
+        &self,
+        decl_id: DeclId,
+        call: EvaluatedCall,
+        input: PipelineData,
+        redirect_stdout: bool,
+        redirect_stderr: bool,
+    ) -> Result<PipelineData, ShellError> {
+        let call = EngineCall::CallDecl {
+            decl_id,
+            call,
+            input,
+            redirect_stdout,
+            redirect_stderr,
+        };
+
+        match self.engine_call(call)? {
+            EngineCallResponse::Error(err) => Err(err),
+            EngineCallResponse::PipelineData(data) => Ok(data),
+            _ => Err(ShellError::PluginFailedToDecode {
+                msg: "Received unexpected response type for EngineCall::CallDecl".into(),
+            }),
+        }
+    }
+
     /// Tell the engine whether to disable garbage collection for this plugin.
     ///
     /// The garbage collector is enabled by default, but plugins can turn it off (ideally
diff --git a/crates/nu-protocol/src/ir/call.rs b/crates/nu-protocol/src/ir/call.rs
index 3d16f82eb6..b931373017 100644
--- a/crates/nu-protocol/src/ir/call.rs
+++ b/crates/nu-protocol/src/ir/call.rs
@@ -239,7 +239,7 @@ impl CallBuilder {
         }
         self.inner.args_len += 1;
         if let Some(span) = argument.span() {
-            self.inner.span = self.inner.span.append(span);
+            self.inner.span = self.inner.span.merge(span);
         }
         stack.arguments.push(argument);
         self
diff --git a/crates/nu_plugin_example/src/commands/call_decl.rs b/crates/nu_plugin_example/src/commands/call_decl.rs
new file mode 100644
index 0000000000..d8c8206d90
--- /dev/null
+++ b/crates/nu_plugin_example/src/commands/call_decl.rs
@@ -0,0 +1,78 @@
+use nu_plugin::{EngineInterface, EvaluatedCall, PluginCommand};
+use nu_protocol::{
+    IntoSpanned, LabeledError, PipelineData, Record, Signature, Spanned, SyntaxShape, Value,
+};
+
+use crate::ExamplePlugin;
+
+pub struct CallDecl;
+
+impl PluginCommand for CallDecl {
+    type Plugin = ExamplePlugin;
+
+    fn name(&self) -> &str {
+        "example call-decl"
+    }
+
+    fn signature(&self) -> Signature {
+        Signature::build(self.name())
+            .required(
+                "name",
+                SyntaxShape::String,
+                "the name of the command to call",
+            )
+            .optional(
+                "named_args",
+                SyntaxShape::Record(vec![]),
+                "named arguments to pass to the command",
+            )
+            .rest(
+                "positional_args",
+                SyntaxShape::Any,
+                "positional arguments to pass to the command",
+            )
+    }
+
+    fn usage(&self) -> &str {
+        "Demonstrates calling other commands from plugins using `call_decl()`."
+    }
+
+    fn extra_usage(&self) -> &str {
+        "
+The arguments will not be typechecked at parse time. This command is for
+demonstration only, and should not be used for anything real.
+"
+        .trim()
+    }
+
+    fn run(
+        &self,
+        _plugin: &ExamplePlugin,
+        engine: &EngineInterface,
+        call: &EvaluatedCall,
+        input: PipelineData,
+    ) -> Result<PipelineData, LabeledError> {
+        let name: Spanned<String> = call.req(0)?;
+        let named_args: Option<Record> = call.opt(1)?;
+        let positional_args: Vec<Value> = call.rest(2)?;
+
+        let decl_id = engine.find_decl(&name.item)?.ok_or_else(|| {
+            LabeledError::new(format!("Can't find `{}`", name.item))
+                .with_label("not in scope", name.span)
+        })?;
+
+        let mut new_call = EvaluatedCall::new(call.head);
+
+        for (key, val) in named_args.into_iter().flatten() {
+            new_call.add_named(key.into_spanned(val.span()), val);
+        }
+
+        for val in positional_args {
+            new_call.add_positional(val);
+        }
+
+        let result = engine.call_decl(decl_id, new_call, input, true, false)?;
+
+        Ok(result)
+    }
+}
diff --git a/crates/nu_plugin_example/src/commands/mod.rs b/crates/nu_plugin_example/src/commands/mod.rs
index dd808616a9..b1703a3a47 100644
--- a/crates/nu_plugin_example/src/commands/mod.rs
+++ b/crates/nu_plugin_example/src/commands/mod.rs
@@ -13,11 +13,13 @@ pub use three::Three;
 pub use two::Two;
 
 // Engine interface demos
+mod call_decl;
 mod config;
 mod disable_gc;
 mod env;
 mod view_span;
 
+pub use call_decl::CallDecl;
 pub use config::Config;
 pub use disable_gc::DisableGc;
 pub use env::Env;
diff --git a/crates/nu_plugin_example/src/lib.rs b/crates/nu_plugin_example/src/lib.rs
index acbebf9b39..3db97659bf 100644
--- a/crates/nu_plugin_example/src/lib.rs
+++ b/crates/nu_plugin_example/src/lib.rs
@@ -27,6 +27,7 @@ impl Plugin for ExamplePlugin {
             Box::new(Env),
             Box::new(ViewSpan),
             Box::new(DisableGc),
+            Box::new(CallDecl),
             // Stream demos
             Box::new(CollectBytes),
             Box::new(Echo),
diff --git a/tests/plugins/call_decl.rs b/tests/plugins/call_decl.rs
new file mode 100644
index 0000000000..06aded3f42
--- /dev/null
+++ b/tests/plugins/call_decl.rs
@@ -0,0 +1,42 @@
+use nu_test_support::nu_with_plugins;
+
+#[test]
+fn call_to_json() {
+    let result = nu_with_plugins!(
+        cwd: ".",
+        plugin: ("nu_plugin_example"),
+        r#"
+            [42] | example call-decl 'to json' {indent: 4}
+        "#
+    );
+    assert!(result.status.success());
+    // newlines are removed from test output
+    assert_eq!("[    42]", result.out);
+}
+
+#[test]
+fn call_reduce() {
+    let result = nu_with_plugins!(
+        cwd: ".",
+        plugin: ("nu_plugin_example"),
+        r#"
+            [1 2 3] | example call-decl 'reduce' {fold: 10} { |it, acc| $it + $acc }
+        "#
+    );
+    assert!(result.status.success());
+    assert_eq!("16", result.out);
+}
+
+#[test]
+fn call_scope_variables() {
+    let result = nu_with_plugins!(
+        cwd: ".",
+        plugin: ("nu_plugin_example"),
+        r#"
+            let test_var = 10
+            example call-decl 'scope variables' | where name == '$test_var' | length
+        "#
+    );
+    assert!(result.status.success());
+    assert_eq!("1", result.out);
+}
diff --git a/tests/plugins/mod.rs b/tests/plugins/mod.rs
index 605f78b564..cdcf3a9c94 100644
--- a/tests/plugins/mod.rs
+++ b/tests/plugins/mod.rs
@@ -1,3 +1,4 @@
+mod call_decl;
 mod config;
 mod core_inc;
 mod custom_values;

From e8764de3c67281a513a36457656de80c652e8b06 Mon Sep 17 00:00:00 2001
From: Wind <WindSoilder@outlook.com>
Date: Fri, 19 Jul 2024 15:21:02 +0800
Subject: [PATCH 05/14] don't allow break/continue in `each` and `items`
 command (#13398)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

# Description
Fixes: #11451

# User-Facing Changes
### Before
```nushell
❯ [1 2 3] | each {|e| break; print $e }
╭────────────╮
│ empty list │
╰────────────╯
```

### After
```
❯ [1 2 3] | each {|e| break; print $e }
Error: nu::shell::eval_block_with_input

  × Eval block failed with pipeline input
   ╭─[entry #9:1:2]
 1 │ [1 2 3] | each {|e| break; print $e }
   ·  ┬
   ·  ╰── source value
   ╰────

Error:   × Break used outside of loop
   ╭─[entry #9:1:21]
 1 │ [1 2 3] | each {|e| break; print $e }
   ·                     ──┬──
   ·                       ╰── used outside of loop
   ╰────
```

# Tests + Formatting
Removes some tests.
---
 crates/nu-cmd-lang/src/core_commands/break_.rs   |  4 +++-
 .../nu-cmd-lang/src/core_commands/continue_.rs   |  4 +++-
 crates/nu-command/src/filters/each.rs            |  8 --------
 crates/nu-command/src/filters/items.rs           |  1 -
 crates/nu-command/tests/commands/break_.rs       |  9 ---------
 crates/nu-command/tests/commands/each.rs         | 16 ----------------
 6 files changed, 6 insertions(+), 36 deletions(-)

diff --git a/crates/nu-cmd-lang/src/core_commands/break_.rs b/crates/nu-cmd-lang/src/core_commands/break_.rs
index 90cc1a73f2..943e32a531 100644
--- a/crates/nu-cmd-lang/src/core_commands/break_.rs
+++ b/crates/nu-cmd-lang/src/core_commands/break_.rs
@@ -21,7 +21,9 @@ impl Command for Break {
 
     fn extra_usage(&self) -> &str {
         r#"This command is a parser keyword. For details, check:
-  https://www.nushell.sh/book/thinking_in_nu.html"#
+  https://www.nushell.sh/book/thinking_in_nu.html
+
+  break can only be used in while, loop, and for loops. It can not be used with each or other filter commands"#
     }
 
     fn command_type(&self) -> CommandType {
diff --git a/crates/nu-cmd-lang/src/core_commands/continue_.rs b/crates/nu-cmd-lang/src/core_commands/continue_.rs
index cfa3e38335..b795456c1d 100644
--- a/crates/nu-cmd-lang/src/core_commands/continue_.rs
+++ b/crates/nu-cmd-lang/src/core_commands/continue_.rs
@@ -21,7 +21,9 @@ impl Command for Continue {
 
     fn extra_usage(&self) -> &str {
         r#"This command is a parser keyword. For details, check:
-  https://www.nushell.sh/book/thinking_in_nu.html"#
+  https://www.nushell.sh/book/thinking_in_nu.html
+
+  continue can only be used in while, loop, and for loops. It can not be used with each or other filter commands"#
     }
 
     fn command_type(&self) -> CommandType {
diff --git a/crates/nu-command/src/filters/each.rs b/crates/nu-command/src/filters/each.rs
index fa1bf390b8..a7035917b3 100644
--- a/crates/nu-command/src/filters/each.rs
+++ b/crates/nu-command/src/filters/each.rs
@@ -132,8 +132,6 @@ with 'transpose' first."#
                             Ok(data) => Some(data.into_value(head).unwrap_or_else(|err| {
                                 Value::error(chain_error_with_input(err, is_error, span), span)
                             })),
-                            Err(ShellError::Continue { span }) => Some(Value::nothing(span)),
-                            Err(ShellError::Break { .. }) => None,
                             Err(error) => {
                                 let error = chain_error_with_input(error, is_error, span);
                                 Some(Value::error(error, span))
@@ -149,10 +147,6 @@ with 'transpose' first."#
                         .map_while(move |value| {
                             let value = match value {
                                 Ok(value) => value,
-                                Err(ShellError::Continue { span }) => {
-                                    return Some(Value::nothing(span))
-                                }
-                                Err(ShellError::Break { .. }) => return None,
                                 Err(err) => return Some(Value::error(err, head)),
                             };
 
@@ -163,8 +157,6 @@ with 'transpose' first."#
                                 .and_then(|data| data.into_value(head))
                             {
                                 Ok(value) => Some(value),
-                                Err(ShellError::Continue { span }) => Some(Value::nothing(span)),
-                                Err(ShellError::Break { .. }) => None,
                                 Err(error) => {
                                     let error = chain_error_with_input(error, is_error, span);
                                     Some(Value::error(error, span))
diff --git a/crates/nu-command/src/filters/items.rs b/crates/nu-command/src/filters/items.rs
index 04a4c6d672..fa7b680afd 100644
--- a/crates/nu-command/src/filters/items.rs
+++ b/crates/nu-command/src/filters/items.rs
@@ -60,7 +60,6 @@ impl Command for Items {
 
                                 match result {
                                     Ok(value) => Some(value),
-                                    Err(ShellError::Break { .. }) => None,
                                     Err(err) => {
                                         let err = chain_error_with_input(err, false, span);
                                         Some(Value::error(err, head))
diff --git a/crates/nu-command/tests/commands/break_.rs b/crates/nu-command/tests/commands/break_.rs
index 92a6564658..7398ac62e5 100644
--- a/crates/nu-command/tests/commands/break_.rs
+++ b/crates/nu-command/tests/commands/break_.rs
@@ -15,12 +15,3 @@ fn break_while_loop() {
 
     assert_eq!(actual.out, "hello");
 }
-
-#[test]
-fn break_each() {
-    let actual = nu!("
-        [1, 2, 3, 4, 5] | each {|x| if $x > 3 { break }; $x} | math sum
-        ");
-
-    assert_eq!(actual.out, "6");
-}
diff --git a/crates/nu-command/tests/commands/each.rs b/crates/nu-command/tests/commands/each.rs
index f8e87d5537..65b922bb6e 100644
--- a/crates/nu-command/tests/commands/each.rs
+++ b/crates/nu-command/tests/commands/each.rs
@@ -58,22 +58,6 @@ fn each_while_uses_enumerate_index() {
     assert_eq!(actual.out, "[0, 1, 2, 3]");
 }
 
-#[test]
-fn each_element_continue_command() {
-    let actual =
-        nu!("[1,2,3,4,6,7] | each { |x| if ($x mod 2 == 0) {continue} else { $x }} | to nuon");
-
-    assert_eq!(actual.out, "[1, 3, 7]");
-}
-
-#[test]
-fn each_element_break_command() {
-    let actual =
-        nu!("[1,2,5,4,6,7] | each { |x| if ($x mod 3 == 0) {break} else { $x }} | to nuon");
-
-    assert_eq!(actual.out, "[1, 2, 5, 4]");
-}
-
 #[test]
 fn errors_in_nested_each_show() {
     let actual = nu!("[[1,2]] | each {|x| $x | each {|y| error make {msg: \"oh noes\"} } }");

From e281c03403aaf71de3cf3090bb12011dd578090d Mon Sep 17 00:00:00 2001
From: Wind <WindSoilder@outlook.com>
Date: Fri, 19 Jul 2024 15:22:28 +0800
Subject: [PATCH 06/14] generate: switch the position of <initial> and
 <closure>, so the closure can have default parameters  (#13393)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

# Description
Close: #12083
Close: #12084

# User-Facing Changes
It's a breaking change because we have switched the position of
`<initial>` and `<closure>`, after the change, initial value will be
optional. So it's possible to do something like this:

```nushell
> let f = {|fib = [0, 1]| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} }
> generate $f | first 5
╭───┬───╮
│ 0 │ 0 │
│ 1 │ 1 │
│ 2 │ 1 │
│ 3 │ 2 │
│ 4 │ 3 │
╰───┴───╯
```

It will also raise error if user don't give initial value, and the
closure don't have default parameter.
```nushell
❯ let f = {|fib| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} }
❯ generate $f
Error:   × The initial value is missing
   ╭─[entry #5:1:1]
 1 │ generate $f
   · ────┬───
   ·     ╰── Missing intial value
   ╰────
  help: Provide <initial> value in generate, or assigning default value to closure parameter

```

# Tests + Formatting
Added some test cases.

---------

Co-authored-by: Stefan Holderbach <sholderbach@users.noreply.github.com>
---
 crates/nu-command/src/generators/generate.rs | 53 +++++++++--
 crates/nu-command/tests/commands/generate.rs | 95 +++++++++++++++-----
 2 files changed, 117 insertions(+), 31 deletions(-)

diff --git a/crates/nu-command/src/generators/generate.rs b/crates/nu-command/src/generators/generate.rs
index 8d4f48d3fb..2b2cf838f4 100644
--- a/crates/nu-command/src/generators/generate.rs
+++ b/crates/nu-command/src/generators/generate.rs
@@ -12,12 +12,12 @@ impl Command for Generate {
     fn signature(&self) -> Signature {
         Signature::build("generate")
             .input_output_types(vec![(Type::Nothing, Type::List(Box::new(Type::Any)))])
-            .required("initial", SyntaxShape::Any, "Initial value.")
             .required(
                 "closure",
                 SyntaxShape::Closure(Some(vec![SyntaxShape::Any])),
                 "Generator function.",
             )
+            .optional("initial", SyntaxShape::Any, "Initial value.")
             .allow_variants_without_examples(true)
             .category(Category::Generators)
     }
@@ -41,7 +41,7 @@ used as the next argument to the closure, otherwise generation stops.
     fn examples(&self) -> Vec<Example> {
         vec![
             Example {
-                example: "generate 0 {|i| if $i <= 10 { {out: $i, next: ($i + 2)} }}",
+                example: "generate {|i| if $i <= 10 { {out: $i, next: ($i + 2)} }} 0",
                 description: "Generate a sequence of numbers",
                 result: Some(Value::list(
                     vec![
@@ -57,10 +57,17 @@ used as the next argument to the closure, otherwise generation stops.
             },
             Example {
                 example:
-                    "generate [0, 1] {|fib| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} }",
+                    "generate {|fib| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} } [0, 1]",
                 description: "Generate a continuous stream of Fibonacci numbers",
                 result: None,
             },
+            Example {
+                example:
+                    "generate {|fib=[0, 1]| {out: $fib.0, next: [$fib.1, ($fib.0 + $fib.1)]} }",
+                description:
+                    "Generate a continuous stream of Fibonacci numbers, using default parameters",
+                result: None,
+            },
         ]
     }
 
@@ -72,15 +79,15 @@ used as the next argument to the closure, otherwise generation stops.
         _input: PipelineData,
     ) -> Result<PipelineData, ShellError> {
         let head = call.head;
-        let initial: Value = call.req(engine_state, stack, 0)?;
-        let closure: Closure = call.req(engine_state, stack, 1)?;
-
+        let closure: Closure = call.req(engine_state, stack, 0)?;
+        let initial: Option<Value> = call.opt(engine_state, stack, 1)?;
+        let block = engine_state.get_block(closure.block_id);
         let mut closure = ClosureEval::new(engine_state, stack, closure);
 
         // A type of Option<S> is used to represent state. Invocation
         // will stop on None. Using Option<S> allows functions to output
         // one final value before stopping.
-        let mut state = Some(initial);
+        let mut state = Some(get_initial_state(initial, &block.signature, call.head)?);
         let iter = std::iter::from_fn(move || {
             let arg = state.take()?;
 
@@ -170,6 +177,38 @@ used as the next argument to the closure, otherwise generation stops.
     }
 }
 
+fn get_initial_state(
+    initial: Option<Value>,
+    signature: &Signature,
+    span: Span,
+) -> Result<Value, ShellError> {
+    match initial {
+        Some(v) => Ok(v),
+        None => {
+            // the initial state should be referred from signature
+            if !signature.optional_positional.is_empty()
+                && signature.optional_positional[0].default_value.is_some()
+            {
+                Ok(signature.optional_positional[0]
+                    .default_value
+                    .clone()
+                    .expect("Already checked default value"))
+            } else {
+                Err(ShellError::GenericError {
+                    error: "The initial value is missing".to_string(),
+                    msg: "Missing initial value".to_string(),
+                    span: Some(span),
+                    help: Some(
+                        "Provide an <initial> value as an argument to generate, or assign a default value to the closure parameter"
+                            .to_string(),
+                    ),
+                    inner: vec![],
+                })
+            }
+        }
+    }
+}
+
 #[cfg(test)]
 mod test {
     use super::*;
diff --git a/crates/nu-command/tests/commands/generate.rs b/crates/nu-command/tests/commands/generate.rs
index 9dc2142515..9ae3b8823e 100644
--- a/crates/nu-command/tests/commands/generate.rs
+++ b/crates/nu-command/tests/commands/generate.rs
@@ -3,7 +3,7 @@ use nu_test_support::{nu, pipeline};
 #[test]
 fn generate_no_next_break() {
     let actual = nu!(
-        "generate 1 {|x| if $x == 3 { {out: $x}} else { {out: $x, next: ($x + 1)} }} | to nuon"
+        "generate {|x| if $x == 3 { {out: $x}} else { {out: $x, next: ($x + 1)} }} 1 | to nuon"
     );
 
     assert_eq!(actual.out, "[1, 2, 3]");
@@ -11,7 +11,7 @@ fn generate_no_next_break() {
 
 #[test]
 fn generate_null_break() {
-    let actual = nu!("generate 1 {|x| if $x <= 3 { {out: $x, next: ($x + 1)} }} | to nuon");
+    let actual = nu!("generate {|x| if $x <= 3 { {out: $x, next: ($x + 1)} }} 1 | to nuon");
 
     assert_eq!(actual.out, "[1, 2, 3]");
 }
@@ -20,13 +20,13 @@ fn generate_null_break() {
 fn generate_allows_empty_output() {
     let actual = nu!(pipeline(
         r#"
-        generate 0 {|x|
+        generate {|x|
           if $x == 1 {
             {next: ($x + 1)}
           } else if $x < 3 {
             {out: $x, next: ($x + 1)}
           }
-        } | to nuon
+        } 0 | to nuon
       "#
     ));
 
@@ -37,11 +37,11 @@ fn generate_allows_empty_output() {
 fn generate_allows_no_output() {
     let actual = nu!(pipeline(
         r#"
-        generate 0 {|x|
+        generate {|x|
           if $x < 3 {
             {next: ($x + 1)}
           }
-        } | to nuon
+        } 0 | to nuon
       "#
     ));
 
@@ -52,7 +52,7 @@ fn generate_allows_no_output() {
 fn generate_allows_null_state() {
     let actual = nu!(pipeline(
         r#"
-        generate 0 {|x|
+        generate {|x|
           if $x == null {
             {out: "done"}
           } else if $x < 1 {
@@ -60,7 +60,7 @@ fn generate_allows_null_state() {
           } else {
             {out: "stopping", next: null}
           }
-        } | to nuon
+        } 0 | to nuon
       "#
     ));
 
@@ -71,7 +71,42 @@ fn generate_allows_null_state() {
 fn generate_allows_null_output() {
     let actual = nu!(pipeline(
         r#"
-        generate 0 {|x|
+        generate {|x|
+          if $x == 3 {
+            {out: "done"}
+          } else {
+            {out: null, next: ($x + 1)}
+          }
+        } 0 | to nuon
+      "#
+    ));
+
+    assert_eq!(actual.out, "[null, null, null, done]");
+}
+
+#[test]
+fn generate_disallows_extra_keys() {
+    let actual = nu!("generate {|x| {foo: bar, out: $x}} 0 ");
+    assert!(actual.err.contains("Invalid block return"));
+}
+
+#[test]
+fn generate_disallows_list() {
+    let actual = nu!("generate {|x| [$x, ($x + 1)]} 0 ");
+    assert!(actual.err.contains("Invalid block return"));
+}
+
+#[test]
+fn generate_disallows_primitive() {
+    let actual = nu!("generate {|x| 1} 0");
+    assert!(actual.err.contains("Invalid block return"));
+}
+
+#[test]
+fn generate_allow_default_parameter() {
+    let actual = nu!(pipeline(
+        r#"
+        generate {|x = 0|
           if $x == 3 {
             {out: "done"}
           } else {
@@ -82,22 +117,34 @@ fn generate_allows_null_output() {
     ));
 
     assert_eq!(actual.out, "[null, null, null, done]");
+
+    // if initial is given, use initial value
+    let actual = nu!(pipeline(
+        r#"
+        generate {|x = 0|
+          if $x == 3 {
+            {out: "done"}
+          } else {
+            {out: null, next: ($x + 1)}
+          }
+        } 1 | to nuon
+      "#
+    ));
+    assert_eq!(actual.out, "[null, null, done]");
 }
 
 #[test]
-fn generate_disallows_extra_keys() {
-    let actual = nu!("generate 0 {|x| {foo: bar, out: $x}}");
-    assert!(actual.err.contains("Invalid block return"));
-}
-
-#[test]
-fn generate_disallows_list() {
-    let actual = nu!("generate 0 {|x| [$x, ($x + 1)]}");
-    assert!(actual.err.contains("Invalid block return"));
-}
-
-#[test]
-fn generate_disallows_primitive() {
-    let actual = nu!("generate 0 {|x| 1}");
-    assert!(actual.err.contains("Invalid block return"));
+fn generate_raise_error_on_no_default_parameter_closure_and_init_val() {
+    let actual = nu!(pipeline(
+        r#"
+        generate {|x|
+          if $x == 3 {
+            {out: "done"}
+          } else {
+            {out: null, next: ($x + 1)}
+          }
+        } | to nuon
+      "#
+    ));
+    assert!(actual.err.contains("The initial value is missing"));
 }

From 4665323bb4ba0931c690bf743b54a27ec493822a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jan=20Christian=20Gr=C3=BCnhage?=
 <jan.christian@gruenhage.xyz>
Date: Fri, 19 Jul 2024 12:47:07 +0200
Subject: [PATCH 07/14] Use directories for autoloading (#13382)

fixes https://github.com/nushell/nushell/issues/13378

# Description

This PR tries to improve usage of system APIs to determine the location
of vendored autoload files.

# User-Facing Changes
The paths listed in #13180 and #13217 are changing. This has not been
part of a release yet, so arguably the user facing changes are only to
unreleased features anyway.

# Tests + Formatting
Haven't done, but if someone wants to help me here, I'm open to doing
it. I just don't know how to properly test this.

# After Submitting
---
 Cargo.lock                             |   3 +
 Cargo.toml                             |   2 +
 crates/nu-cli/tests/completions/mod.rs |   2 +-
 crates/nu-protocol/Cargo.toml          |   5 +
 crates/nu-protocol/src/eval_const.rs   | 143 ++++++++++++++++---------
 src/config_files.rs                    |   3 +-
 6 files changed, 107 insertions(+), 51 deletions(-)

diff --git a/Cargo.lock b/Cargo.lock
index 49770a9a0d..97f81d5185 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -3344,6 +3344,8 @@ dependencies = [
  "chrono",
  "chrono-humanize",
  "convert_case",
+ "dirs",
+ "dirs-sys",
  "fancy-regex",
  "indexmap",
  "log",
@@ -3367,6 +3369,7 @@ dependencies = [
  "tempfile",
  "thiserror",
  "typetag",
+ "windows-sys 0.48.0",
 ]
 
 [[package]]
diff --git a/Cargo.toml b/Cargo.toml
index 335e2f0384..81602fb4b1 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -84,6 +84,7 @@ deunicode = "1.6.0"
 dialoguer = { default-features = false, version = "0.11" }
 digest = { default-features = false, version = "0.10" }
 dirs = "5.0"
+dirs-sys = "0.4"
 dtparse = "2.0"
 encoding_rs = "0.8"
 fancy-regex = "0.13"
@@ -178,6 +179,7 @@ v_htmlescape = "0.15.0"
 wax = "0.6"
 which = "6.0.0"
 windows = "0.54"
+windows-sys = "0.48"
 winreg = "0.52"
 
 [dependencies]
diff --git a/crates/nu-cli/tests/completions/mod.rs b/crates/nu-cli/tests/completions/mod.rs
index 590979251e..aa51e5c9bb 100644
--- a/crates/nu-cli/tests/completions/mod.rs
+++ b/crates/nu-cli/tests/completions/mod.rs
@@ -833,7 +833,7 @@ fn variables_completions() {
         "plugin-path".into(),
         "startup-time".into(),
         "temp-path".into(),
-        "vendor-autoload-dir".into(),
+        "vendor-autoload-dirs".into(),
     ];
 
     // Match results
diff --git a/crates/nu-protocol/Cargo.toml b/crates/nu-protocol/Cargo.toml
index eaa861a073..ea900bf08d 100644
--- a/crates/nu-protocol/Cargo.toml
+++ b/crates/nu-protocol/Cargo.toml
@@ -23,6 +23,7 @@ byte-unit = { version = "5.1", features = [ "serde" ] }
 chrono = { workspace = true, features = [ "serde", "std", "unstable-locales" ], default-features = false }
 chrono-humanize = { workspace = true }
 convert_case = { workspace = true }
+dirs = { workspace = true }
 fancy-regex = { workspace = true }
 indexmap = { workspace = true }
 lru = { workspace = true }
@@ -38,6 +39,10 @@ log = { workspace = true }
 [target.'cfg(unix)'.dependencies]
 nix = { workspace = true, default-features = false, features = ["signal"] }
 
+[target.'cfg(windows)'.dependencies]
+dirs-sys = { workspace = true }
+windows-sys = { workspace = true }
+
 [features]
 plugin = [
   "brotli",
diff --git a/crates/nu-protocol/src/eval_const.rs b/crates/nu-protocol/src/eval_const.rs
index cd4780773f..4cb5ccc0f5 100644
--- a/crates/nu-protocol/src/eval_const.rs
+++ b/crates/nu-protocol/src/eval_const.rs
@@ -185,24 +185,15 @@ pub(crate) fn create_nu_constant(engine_state: &EngineState, span: Span) -> Valu
         },
     );
 
-    // Create a system level directory for nushell scripts, modules, completions, etc
-    // that can be changed by setting the NU_VENDOR_AUTOLOAD_DIR env var on any platform
-    // before nushell is compiled OR if NU_VENDOR_AUTOLOAD_DIR is not set for non-windows
-    // systems, the PREFIX env var can be set before compile and used as PREFIX/nushell/vendor/autoload
     record.push(
-        "vendor-autoload-dir",
-        // pseudo code
-        // if env var NU_VENDOR_AUTOLOAD_DIR is set, in any platform, use it
-        // if not, if windows, use ALLUSERPROFILE\nushell\vendor\autoload
-        // if not, if non-windows, if env var PREFIX is set, use PREFIX/share/nushell/vendor/autoload
-        // if not, use the default /usr/share/nushell/vendor/autoload
-
-        // check to see if NU_VENDOR_AUTOLOAD_DIR env var is set, if not, use the default
-        if let Some(path) = get_vendor_autoload_dir(engine_state) {
-            Value::string(path.to_string_lossy(), span)
-        } else {
-            Value::error(ShellError::ConfigDirNotFound { span: Some(span) }, span)
-        },
+        "vendor-autoload-dirs",
+        Value::list(
+            get_vendor_autoload_dirs(engine_state)
+                .iter()
+                .map(|path| Value::string(path.to_string_lossy(), span))
+                .collect(),
+            span,
+        ),
     );
 
     record.push("temp-path", {
@@ -259,39 +250,95 @@ pub(crate) fn create_nu_constant(engine_state: &EngineState, span: Span) -> Valu
     Value::record(record, span)
 }
 
-pub fn get_vendor_autoload_dir(engine_state: &EngineState) -> Option<PathBuf> {
-    // pseudo code
-    // if env var NU_VENDOR_AUTOLOAD_DIR is set, in any platform, use it
-    // if not, if windows, use ALLUSERPROFILE\nushell\vendor\autoload
-    // if not, if non-windows, if env var PREFIX is set, use PREFIX/share/nushell/vendor/autoload
-    // if not, use the default /usr/share/nushell/vendor/autoload
+pub fn get_vendor_autoload_dirs(_engine_state: &EngineState) -> Vec<PathBuf> {
+    // load order for autoload dirs
+    // /Library/Application Support/nushell/vendor/autoload on macOS
+    // <dir>/nushell/vendor/autoload for every dir in XDG_DATA_DIRS in reverse order on platforms other than windows. If XDG_DATA_DIRS is not set, it falls back to <PREFIX>/share if PREFIX ends in local, or <PREFIX>/local/share:<PREFIX>/share otherwise. If PREFIX is not set, fall back to /usr/local/share:/usr/share.
+    // %ProgramData%\nushell\vendor\autoload on windows
+    // NU_VENDOR_AUTOLOAD_DIR from compile time, if env var is set at compile time
+    // if on macOS, additionally check XDG_DATA_HOME, which `dirs` is only doing on Linux
+    // <data_dir>/nushell/vendor/autoload of the current user according to the `dirs` crate
+    // NU_VENDOR_AUTOLOAD_DIR at runtime, if env var is set
 
-    // check to see if NU_VENDOR_AUTOLOAD_DIR env var is set, if not, use the default
-    Some(
-        option_env!("NU_VENDOR_AUTOLOAD_DIR")
-            .map(String::from)
-            .unwrap_or_else(|| {
-                if cfg!(windows) {
-                    let all_user_profile = match engine_state.get_env_var("ALLUSERPROFILE") {
-                        Some(v) => format!(
-                            "{}\\nushell\\vendor\\autoload",
-                            v.coerce_string().unwrap_or("C:\\ProgramData".into())
-                        ),
-                        None => "C:\\ProgramData\\nushell\\vendor\\autoload".into(),
-                    };
-                    all_user_profile
-                } else {
-                    // In non-Windows environments, if NU_VENDOR_AUTOLOAD_DIR is not set
-                    // check to see if PREFIX env var is set, and use it as PREFIX/nushell/vendor/autoload
-                    // otherwise default to /usr/share/nushell/vendor/autoload
-                    option_env!("PREFIX").map(String::from).map_or_else(
-                        || "/usr/local/share/nushell/vendor/autoload".into(),
-                        |prefix| format!("{}/share/nushell/vendor/autoload", prefix),
-                    )
-                }
+    let into_autoload_path_fn = |mut path: PathBuf| {
+        path.push("nushell");
+        path.push("vendor");
+        path.push("autoload");
+        path
+    };
+
+    let mut dirs = Vec::new();
+
+    let mut append_fn = |path: PathBuf| {
+        if !dirs.contains(&path) {
+            dirs.push(path)
+        }
+    };
+
+    #[cfg(target_os = "macos")]
+    std::iter::once("/Library/Application Support")
+        .map(PathBuf::from)
+        .map(into_autoload_path_fn)
+        .for_each(&mut append_fn);
+    #[cfg(unix)]
+    {
+        use std::os::unix::ffi::OsStrExt;
+
+        std::env::var_os("XDG_DATA_DIRS")
+            .or_else(|| {
+                option_env!("PREFIX").map(|prefix| {
+                    if prefix.ends_with("local") {
+                        std::ffi::OsString::from(format!("{prefix}/share"))
+                    } else {
+                        std::ffi::OsString::from(format!("{prefix}/local/share:{prefix}/share"))
+                    }
+                })
             })
-            .into(),
-    )
+            .unwrap_or_else(|| std::ffi::OsString::from("/usr/local/share/:/usr/share/"))
+            .as_encoded_bytes()
+            .split(|b| *b == b':')
+            .map(|split| into_autoload_path_fn(PathBuf::from(std::ffi::OsStr::from_bytes(split))))
+            .rev()
+            .for_each(&mut append_fn);
+    }
+
+    #[cfg(target_os = "windows")]
+    dirs_sys::known_folder(windows_sys::Win32::UI::Shell::FOLDERID_ProgramData)
+        .into_iter()
+        .map(into_autoload_path_fn)
+        .for_each(&mut append_fn);
+
+    option_env!("NU_VENDOR_AUTOLOAD_DIR")
+        .into_iter()
+        .map(PathBuf::from)
+        .for_each(&mut append_fn);
+
+    #[cfg(target_os = "macos")]
+    std::env::var("XDG_DATA_HOME")
+        .ok()
+        .map(PathBuf::from)
+        .or_else(|| {
+            dirs::home_dir().map(|mut home| {
+                home.push(".local");
+                home.push("share");
+                home
+            })
+        })
+        .map(into_autoload_path_fn)
+        .into_iter()
+        .for_each(&mut append_fn);
+
+    dirs::data_dir()
+        .into_iter()
+        .map(into_autoload_path_fn)
+        .for_each(&mut append_fn);
+
+    std::env::var_os("NU_VENDOR_AUTOLOAD_DIR")
+        .into_iter()
+        .map(PathBuf::from)
+        .for_each(&mut append_fn);
+
+    dirs
 }
 
 fn eval_const_call(
diff --git a/src/config_files.rs b/src/config_files.rs
index 30977a6d0e..cf72819600 100644
--- a/src/config_files.rs
+++ b/src/config_files.rs
@@ -200,8 +200,7 @@ pub(crate) fn read_vendor_autoload_files(engine_state: &mut EngineState, stack:
         column!()
     );
 
-    // read and source vendor_autoload_files file if exists
-    if let Some(autoload_dir) = nu_protocol::eval_const::get_vendor_autoload_dir(engine_state) {
+    for autoload_dir in nu_protocol::eval_const::get_vendor_autoload_dirs(engine_state) {
         warn!("read_vendor_autoload_files: {}", autoload_dir.display());
 
         if autoload_dir.exists() {

From aa9a42776bd647c513815a23aa08a1693cdf0f41 Mon Sep 17 00:00:00 2001
From: Devyn Cairns <devyn.cairns@gmail.com>
Date: Fri, 19 Jul 2024 05:12:19 -0700
Subject: [PATCH 08/14] Make `ast::Call::span()` and `arguments_span()` more
 robust (#13412)

# Description

Trying to help @amtoine out here - the spans calculated by `ast::Call`
by `span()` and `argument_span()` are suspicious and are not properly
guarding against `end` that might be before `start`. Just using
`Span::merge()` / `merge_many()` instead to try to make the behaviour as
simple and consistent as possible. Hopefully then even if the arguments
have some crazy spans, we don't have spans that are just totally
invalid.

# Tests + Formatting
I did check that everything passes with this.
---
 crates/nu-protocol/src/ast/call.rs | 38 +++++-------------------------
 1 file changed, 6 insertions(+), 32 deletions(-)

diff --git a/crates/nu-protocol/src/ast/call.rs b/crates/nu-protocol/src/ast/call.rs
index d26d0af300..0f7b25e21f 100644
--- a/crates/nu-protocol/src/ast/call.rs
+++ b/crates/nu-protocol/src/ast/call.rs
@@ -110,17 +110,11 @@ impl Call {
     /// If there are one or more arguments the span encompasses the start of the first argument to
     /// end of the last argument
     pub fn arguments_span(&self) -> Span {
-        let past = self.head.past();
-
-        let start = self
-            .arguments
-            .first()
-            .map(|a| a.span())
-            .unwrap_or(past)
-            .start;
-        let end = self.arguments.last().map(|a| a.span()).unwrap_or(past).end;
-
-        Span::new(start, end)
+        if self.arguments.is_empty() {
+            self.head.past()
+        } else {
+            Span::merge_many(self.arguments.iter().map(|a| a.span()))
+        }
     }
 
     pub fn named_iter(
@@ -341,27 +335,7 @@ impl Call {
     }
 
     pub fn span(&self) -> Span {
-        let mut span = self.head;
-
-        for positional in self.positional_iter() {
-            if positional.span.end > span.end {
-                span.end = positional.span.end;
-            }
-        }
-
-        for (named, _, val) in self.named_iter() {
-            if named.span.end > span.end {
-                span.end = named.span.end;
-            }
-
-            if let Some(val) = &val {
-                if val.span.end > span.end {
-                    span.end = val.span.end;
-                }
-            }
-        }
-
-        span
+        self.head.merge(self.arguments_span())
     }
 }
 

From dbd60ed4f46fd4abbb675aac83195f8668978b72 Mon Sep 17 00:00:00 2001
From: suimong <bennyyjx@gmail.com>
Date: Sat, 20 Jul 2024 01:55:38 +0800
Subject: [PATCH 09/14] Tiny make up to the documentation of `reduce` (#13408)

This tiny PR improves the documentation of the `reduce` command by
explicitly stating the direction of reduction, i.e. from left to right,
and adds an example for demonstration.


---------

Co-authored-by: Ben Yang <ben@ya.ng>
---
 crates/nu-command/src/filters/reduce.rs | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/crates/nu-command/src/filters/reduce.rs b/crates/nu-command/src/filters/reduce.rs
index 87fbe3b23e..824e7709e5 100644
--- a/crates/nu-command/src/filters/reduce.rs
+++ b/crates/nu-command/src/filters/reduce.rs
@@ -36,7 +36,7 @@ impl Command for Reduce {
     }
 
     fn usage(&self) -> &str {
-        "Aggregate a list to a single value using an accumulator closure."
+        "Aggregate a list (starting from the left) to a single value using an accumulator closure."
     }
 
     fn search_terms(&self) -> Vec<&str> {
@@ -50,6 +50,11 @@ impl Command for Reduce {
                 description: "Sum values of a list (same as 'math sum')",
                 result: Some(Value::test_int(10)),
             },
+            Example {
+                example: "[ 1 2 3 4 ] | reduce {|it, acc| $acc - $it }",
+                description: r#"`reduce` accumulates value from left to right, equivalent to (((1 - 2) - 3) - 4)."#,
+                result: Some(Value::test_int(-8)),
+            },
             Example {
                 example:
                     "[ 8 7 6 ] | enumerate | reduce --fold 0 {|it, acc| $acc + $it.item + $it.index }",
@@ -61,6 +66,11 @@ impl Command for Reduce {
                 description: "Sum values with a starting value (fold)",
                 result: Some(Value::test_int(20)),
             },
+            Example {
+                example: r#"[[foo baz] [baz quux]] | reduce --fold "foobar" {|it, acc| $acc | str replace $it.0 $it.1}"#,
+                description: "Iteratively perform string replace (from left to right): 'foobar' -> 'bazbar' -> 'quuxbar'",
+                result: Some(Value::test_string("quuxbar")),
+            },
             Example {
                 example: r#"[ i o t ] | reduce --fold "Arthur, King of the Britons" {|it, acc| $acc | str replace --all $it "X" }"#,
                 description: "Replace selected characters in a string with 'X'",

From 5db57abc7d813c7b0bb56709379ea82a4b85b4eb Mon Sep 17 00:00:00 2001
From: Zoey Hewll <Zoybean@users.noreply.github.com>
Date: Sat, 20 Jul 2024 18:34:27 +0800
Subject: [PATCH 10/14] add --quiet flag to `watch` command (#13415)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

# Description
Adds a `--quiet` flag to the `watch` command, silencing the message
usually shown when invoking `watch`.

Resolves #13411

As implemented, `--quiet` is orthogonal to `--verbose`. I'm open to
improving the flag's name, behaviour and/or documentation to make it
more user-friendly, as I realise this may be surprising as-is.

```
> watch path {|a b c| echo $a $b $c}
Now watching files at "/home/user/path". Press ctrl+c to abort.
───┬───────────────────────
 0 │ Remove
 1 │ /home/user/path
 2 │
───┴───────────────────────
^C
> watch --quiet path {|a b c| echo $a $b $c}
───┬───────────────────────
 0 │ Remove
 1 │ /home/user/path
 2 │
───┴───────────────────────
^C
```

# User-Facing Changes

Adds `--quiet`/`-q` flag to `watch`, which removes the initial status
message.
---
 crates/nu-command/src/filesystem/watch.rs | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/crates/nu-command/src/filesystem/watch.rs b/crates/nu-command/src/filesystem/watch.rs
index c9817de250..e6d86357b8 100644
--- a/crates/nu-command/src/filesystem/watch.rs
+++ b/crates/nu-command/src/filesystem/watch.rs
@@ -61,6 +61,7 @@ impl Command for Watch {
                 "Watch all directories under `<path>` recursively. Will be ignored if `<path>` is a file (default: true)",
                 Some('r'),
             )
+            .switch("quiet", "Hide the initial status message (default: false)", Some('q'))
             .switch("verbose", "Operate in verbose mode (default: false)", Some('v'))
             .category(Category::FileSystem)
     }
@@ -94,6 +95,8 @@ impl Command for Watch {
 
         let verbose = call.has_flag(engine_state, stack, "verbose")?;
 
+        let quiet = call.has_flag(engine_state, stack, "quiet")?;
+
         let debounce_duration_flag: Option<Spanned<i64>> =
             call.get_flag(engine_state, stack, "debounce-ms")?;
         let debounce_duration = match debounce_duration_flag {
@@ -161,7 +164,9 @@ impl Command for Watch {
         // need to cache to make sure that rename event works.
         debouncer.cache().add_root(&path, recursive_mode);
 
-        eprintln!("Now watching files at {path:?}. Press ctrl+c to abort.");
+        if !quiet {
+            eprintln!("Now watching files at {path:?}. Press ctrl+c to abort.");
+        }
 
         let mut closure = ClosureEval::new(engine_state, stack, closure);
 

From 01891d637d7b28cff1985bc3211dc4f544996c64 Mon Sep 17 00:00:00 2001
From: Devyn Cairns <devyn.cairns@gmail.com>
Date: Sun, 21 Jul 2024 01:32:36 -0700
Subject: [PATCH 11/14] Make parsing for unknown args in known externals like
 normal external calls (#13414)

# Description

This corrects the parsing of unknown arguments provided to known
externals to behave exactly like external arguments passed to normal
external calls.

I've done this by adding a `SyntaxShape::ExternalArgument` which
triggers the same parsing rules.

Because I didn't like how the highlighting looked, I modified the
flattener to emit `ExternalArg` flat shapes for arguments that have that
syntax shape and are plain strings/globs. This is the same behavior that
external calls have.

Aside from passing the tests, I've also checked manually that the
completer seems to work adequately. I can confirm that specified
positional arguments get completion according to their specified type
(including custom completions), and then anything remaining gets
filepath style completion, as you'd expect from an external command.

Thanks to @OJarrisonn for originally finding this issue.

# User-Facing Changes

- Unknown args are now parsed according to their specified syntax shape,
rather than `Any`. This may be a breaking change, though I think it's
extremely unlikely in practice.
- The unspecified arguments of known externals are now highlighted /
flattened identically to normal external arguments, which makes it more
clear how they're being interpreted, and should help the completer
function properly.
- Known externals now have an implicit rest arg if not specified named
`args`, with a syntax shape of `ExternalArgument`.

# Tests + Formatting
Tests added for the new behaviour. Some old tests had to be corrected to
match.

- :green_circle: `toolkit fmt`
- :green_circle: `toolkit clippy`
- :green_circle: `toolkit test`
- :green_circle: `toolkit test stdlib`

# After Submitting
- [ ] release notes (bugfix, and debatable whether it's a breaking
change)
---
 crates/nu-parser/src/flatten.rs        | 46 +++++++++++++++++++++--
 crates/nu-parser/src/parse_keywords.rs | 10 +++++
 crates/nu-parser/src/parser.rs         | 51 +++++++++++++++++++++-----
 crates/nu-protocol/src/syntax_shape.rs |  8 ++++
 tests/repl/test_custom_commands.rs     |  8 +++-
 tests/repl/test_known_external.rs      | 36 ++++++++++++++++++
 6 files changed, 144 insertions(+), 15 deletions(-)

diff --git a/crates/nu-parser/src/flatten.rs b/crates/nu-parser/src/flatten.rs
index 8a05edaef4..2f71ccea28 100644
--- a/crates/nu-parser/src/flatten.rs
+++ b/crates/nu-parser/src/flatten.rs
@@ -5,7 +5,7 @@ use nu_protocol::{
         RecordItem,
     },
     engine::StateWorkingSet,
-    DeclId, Span, VarId,
+    DeclId, Span, SyntaxShape, VarId,
 };
 use std::fmt::{Display, Formatter, Result};
 
@@ -166,6 +166,22 @@ fn flatten_pipeline_element_into(
     }
 }
 
+fn flatten_positional_arg_into(
+    working_set: &StateWorkingSet,
+    positional: &Expression,
+    shape: &SyntaxShape,
+    output: &mut Vec<(Span, FlatShape)>,
+) {
+    if matches!(shape, SyntaxShape::ExternalArgument)
+        && matches!(positional.expr, Expr::String(..) | Expr::GlobPattern(..))
+    {
+        // Make known external arguments look more like external arguments
+        output.push((positional.span, FlatShape::ExternalArg));
+    } else {
+        flatten_expression_into(working_set, positional, output)
+    }
+}
+
 fn flatten_expression_into(
     working_set: &StateWorkingSet,
     expr: &Expression,
@@ -249,16 +265,40 @@ fn flatten_expression_into(
             }
         }
         Expr::Call(call) => {
+            let decl = working_set.get_decl(call.decl_id);
+
             if call.head.end != 0 {
                 // Make sure we don't push synthetic calls
                 output.push((call.head, FlatShape::InternalCall(call.decl_id)));
             }
 
+            // Follow positional arguments from the signature.
+            let signature = decl.signature();
+            let mut positional_args = signature
+                .required_positional
+                .iter()
+                .chain(&signature.optional_positional);
+
             let arg_start = output.len();
             for arg in &call.arguments {
                 match arg {
-                    Argument::Positional(positional) | Argument::Unknown(positional) => {
-                        flatten_expression_into(working_set, positional, output)
+                    Argument::Positional(positional) => {
+                        let positional_arg = positional_args.next();
+                        let shape = positional_arg
+                            .or(signature.rest_positional.as_ref())
+                            .map(|arg| &arg.shape)
+                            .unwrap_or(&SyntaxShape::Any);
+
+                        flatten_positional_arg_into(working_set, positional, shape, output)
+                    }
+                    Argument::Unknown(positional) => {
+                        let shape = signature
+                            .rest_positional
+                            .as_ref()
+                            .map(|arg| &arg.shape)
+                            .unwrap_or(&SyntaxShape::Any);
+
+                        flatten_positional_arg_into(working_set, positional, shape, output)
                     }
                     Argument::Named(named) => {
                         if named.0.span.end != 0 {
diff --git a/crates/nu-parser/src/parse_keywords.rs b/crates/nu-parser/src/parse_keywords.rs
index 4ac1d20699..7c41f8a78d 100644
--- a/crates/nu-parser/src/parse_keywords.rs
+++ b/crates/nu-parser/src/parse_keywords.rs
@@ -783,6 +783,16 @@ pub fn parse_extern(
                         working_set.get_block_mut(block_id).signature = signature;
                     }
                 } else {
+                    if signature.rest_positional.is_none() {
+                        // Make sure that a known external takes rest args with ExternalArgument
+                        // shape
+                        *signature = signature.rest(
+                            "args",
+                            SyntaxShape::ExternalArgument,
+                            "all other arguments to the command",
+                        );
+                    }
+
                     let decl = KnownExternal {
                         name: external_name,
                         usage,
diff --git a/crates/nu-parser/src/parser.rs b/crates/nu-parser/src/parser.rs
index bc48b9bd0c..1fb0aee01d 100644
--- a/crates/nu-parser/src/parser.rs
+++ b/crates/nu-parser/src/parser.rs
@@ -221,6 +221,22 @@ pub(crate) fn check_call(
     }
 }
 
+/// Parses an unknown argument for the given signature. This handles the parsing as appropriate to
+/// the rest type of the command.
+fn parse_unknown_arg(
+    working_set: &mut StateWorkingSet,
+    span: Span,
+    signature: &Signature,
+) -> Expression {
+    let shape = signature
+        .rest_positional
+        .as_ref()
+        .map(|arg| arg.shape.clone())
+        .unwrap_or(SyntaxShape::Any);
+
+    parse_value(working_set, span, &shape)
+}
+
 /// Parses a string in the arg or head position of an external call.
 ///
 /// If the string begins with `r#`, it is parsed as a raw string. If it doesn't contain any quotes
@@ -427,11 +443,7 @@ fn parse_external_string(working_set: &mut StateWorkingSet, span: Span) -> Expre
 fn parse_external_arg(working_set: &mut StateWorkingSet, span: Span) -> ExternalArgument {
     let contents = working_set.get_span_contents(span);
 
-    if contents.starts_with(b"$") || contents.starts_with(b"(") {
-        ExternalArgument::Regular(parse_dollar_expr(working_set, span))
-    } else if contents.starts_with(b"[") {
-        ExternalArgument::Regular(parse_list_expression(working_set, span, &SyntaxShape::Any))
-    } else if contents.len() > 3
+    if contents.len() > 3
         && contents.starts_with(b"...")
         && (contents[3] == b'$' || contents[3] == b'[' || contents[3] == b'(')
     {
@@ -441,7 +453,19 @@ fn parse_external_arg(working_set: &mut StateWorkingSet, span: Span) -> External
             &SyntaxShape::List(Box::new(SyntaxShape::Any)),
         ))
     } else {
-        ExternalArgument::Regular(parse_external_string(working_set, span))
+        ExternalArgument::Regular(parse_regular_external_arg(working_set, span))
+    }
+}
+
+fn parse_regular_external_arg(working_set: &mut StateWorkingSet, span: Span) -> Expression {
+    let contents = working_set.get_span_contents(span);
+
+    if contents.starts_with(b"$") || contents.starts_with(b"(") {
+        parse_dollar_expr(working_set, span)
+    } else if contents.starts_with(b"[") {
+        parse_list_expression(working_set, span, &SyntaxShape::Any)
+    } else {
+        parse_external_string(working_set, span)
     }
 }
 
@@ -998,7 +1022,7 @@ pub fn parse_internal_call(
                 && signature.allows_unknown_args
             {
                 working_set.parse_errors.truncate(starting_error_count);
-                let arg = parse_value(working_set, arg_span, &SyntaxShape::Any);
+                let arg = parse_unknown_arg(working_set, arg_span, &signature);
 
                 call.add_unknown(arg);
             } else {
@@ -1040,7 +1064,7 @@ pub fn parse_internal_call(
                 && signature.allows_unknown_args
             {
                 working_set.parse_errors.truncate(starting_error_count);
-                let arg = parse_value(working_set, arg_span, &SyntaxShape::Any);
+                let arg = parse_unknown_arg(working_set, arg_span, &signature);
 
                 call.add_unknown(arg);
             } else {
@@ -1135,6 +1159,10 @@ pub fn parse_internal_call(
                     call.add_positional(Expression::garbage(working_set, arg_span));
                 } else {
                     let rest_shape = match &signature.rest_positional {
+                        Some(arg) if matches!(arg.shape, SyntaxShape::ExternalArgument) => {
+                            // External args aren't parsed inside lists in spread position.
+                            SyntaxShape::Any
+                        }
                         Some(arg) => arg.shape.clone(),
                         None => SyntaxShape::Any,
                     };
@@ -1196,7 +1224,7 @@ pub fn parse_internal_call(
             call.add_positional(arg);
             positional_idx += 1;
         } else if signature.allows_unknown_args {
-            let arg = parse_value(working_set, arg_span, &SyntaxShape::Any);
+            let arg = parse_unknown_arg(working_set, arg_span, &signature);
 
             call.add_unknown(arg);
         } else {
@@ -4670,7 +4698,8 @@ pub fn parse_value(
             | SyntaxShape::Signature
             | SyntaxShape::Filepath
             | SyntaxShape::String
-            | SyntaxShape::GlobPattern => {}
+            | SyntaxShape::GlobPattern
+            | SyntaxShape::ExternalArgument => {}
             _ => {
                 working_set.error(ParseError::Expected("non-[] value", span));
                 return Expression::garbage(working_set, span);
@@ -4747,6 +4776,8 @@ pub fn parse_value(
             Expression::garbage(working_set, span)
         }
 
+        SyntaxShape::ExternalArgument => parse_regular_external_arg(working_set, span),
+
         SyntaxShape::Any => {
             if bytes.starts_with(b"[") {
                 //parse_value(working_set, span, &SyntaxShape::Table)
diff --git a/crates/nu-protocol/src/syntax_shape.rs b/crates/nu-protocol/src/syntax_shape.rs
index bece58d59a..66bd77ddca 100644
--- a/crates/nu-protocol/src/syntax_shape.rs
+++ b/crates/nu-protocol/src/syntax_shape.rs
@@ -47,6 +47,12 @@ pub enum SyntaxShape {
     /// A general expression, eg `1 + 2` or `foo --bar`
     Expression,
 
+    /// A (typically) string argument that follows external command argument parsing rules.
+    ///
+    /// Filepaths are expanded if unquoted, globs are allowed, and quotes embedded within unknown
+    /// args are unquoted.
+    ExternalArgument,
+
     /// A filepath is allowed
     Filepath,
 
@@ -145,6 +151,7 @@ impl SyntaxShape {
             SyntaxShape::DateTime => Type::Date,
             SyntaxShape::Duration => Type::Duration,
             SyntaxShape::Expression => Type::Any,
+            SyntaxShape::ExternalArgument => Type::Any,
             SyntaxShape::Filepath => Type::String,
             SyntaxShape::Directory => Type::String,
             SyntaxShape::Float => Type::Float,
@@ -238,6 +245,7 @@ impl Display for SyntaxShape {
             SyntaxShape::Signature => write!(f, "signature"),
             SyntaxShape::MatchBlock => write!(f, "match-block"),
             SyntaxShape::Expression => write!(f, "expression"),
+            SyntaxShape::ExternalArgument => write!(f, "external-argument"),
             SyntaxShape::Boolean => write!(f, "bool"),
             SyntaxShape::Error => write!(f, "error"),
             SyntaxShape::CompleterWrapper(x, _) => write!(f, "completable<{x}>"),
diff --git a/tests/repl/test_custom_commands.rs b/tests/repl/test_custom_commands.rs
index 1f36b0e90c..1710b35913 100644
--- a/tests/repl/test_custom_commands.rs
+++ b/tests/repl/test_custom_commands.rs
@@ -186,8 +186,12 @@ fn help_present_in_def() -> TestResult {
 #[test]
 fn help_not_present_in_extern() -> TestResult {
     run_test(
-        "module test {export extern \"git fetch\" []}; use test `git fetch`; help git fetch | ansi strip",
-        "Usage:\n  > git fetch",
+        r#"
+            module test {export extern "git fetch" []};
+            use test `git fetch`;
+            help git fetch | find help | to text | ansi strip
+        "#,
+        "",
     )
 }
 
diff --git a/tests/repl/test_known_external.rs b/tests/repl/test_known_external.rs
index 96ea677d37..c0278cc863 100644
--- a/tests/repl/test_known_external.rs
+++ b/tests/repl/test_known_external.rs
@@ -137,3 +137,39 @@ fn known_external_aliased_subcommand_from_module() -> TestResult {
         String::from_utf8(output.stdout)?.trim(),
     )
 }
+
+#[test]
+fn known_external_arg_expansion() -> TestResult {
+    run_test(
+        r#"
+            extern echo [];
+            echo ~/foo
+        "#,
+        &dirs::home_dir()
+            .expect("can't find home dir")
+            .join("foo")
+            .to_string_lossy(),
+    )
+}
+
+#[test]
+fn known_external_arg_quoted_no_expand() -> TestResult {
+    run_test(
+        r#"
+            extern echo [];
+            echo "~/foo"
+        "#,
+        "~/foo",
+    )
+}
+
+#[test]
+fn known_external_arg_internally_quoted_options() -> TestResult {
+    run_test(
+        r#"
+            extern echo [];
+            echo --option="test"
+        "#,
+        "--option=test",
+    )
+}

From 9ab706db627df18b19eee727c39d6473892de7dd Mon Sep 17 00:00:00 2001
From: Wenbo Chen <wenbo.chen86@gmail.com>
Date: Sun, 21 Jul 2024 20:30:14 +0900
Subject: [PATCH 12/14] Fix output format borken in `char --list` (#13417)

# Description

Resolve #13395

When we use the `char --list` command, it also outputs the characters
*bel* and *backspace*, which breaks the table's alignment.
![Screenshot from 2024-07-21
16-10-03](https://github.com/user-attachments/assets/54561cbc-f1a1-4ccd-9561-65dccc939280)
I also found that the behaviour at the beginning of the table is not
correct because of some line change characters, as I mentioned in the
issue discussion.

So, the solution is to create a list containing the characters that do
not need to be output. When the name is in the list, we output the
character as an empty string. Here is the behaviour after fixing.
![Screenshot from 2024-07-21
16-16-08](https://github.com/user-attachments/assets/dd3345a3-6a72-4c90-b331-bc95dc6db66a)
![Screenshot from 2024-07-21
16-16-39](https://github.com/user-attachments/assets/57dc5073-7f8d-40c4-9830-36eba23075e6)
---
 crates/nu-command/src/strings/char_.rs | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/crates/nu-command/src/strings/char_.rs b/crates/nu-command/src/strings/char_.rs
index 7737210254..aa987438d2 100644
--- a/crates/nu-command/src/strings/char_.rs
+++ b/crates/nu-command/src/strings/char_.rs
@@ -3,6 +3,7 @@ use nu_engine::command_prelude::*;
 
 use nu_protocol::Signals;
 use once_cell::sync::Lazy;
+use std::collections::HashSet;
 
 // Character used to separate directories in a Path Environment variable on windows is ";"
 #[cfg(target_family = "windows")]
@@ -149,6 +150,24 @@ static CHAR_MAP: Lazy<IndexMap<&'static str, String>> = Lazy::new(|| {
     }
 });
 
+static NO_OUTPUT_CHARS: Lazy<HashSet<&'static str>> = Lazy::new(|| {
+    [
+        // If the character is in the this set, we don't output it to prevent
+        // the broken of `char --list` command table format and alignment.
+        "newline",
+        "enter",
+        "nl",
+        "line_feed",
+        "lf",
+        "cr",
+        "crlf",
+        "bel",
+        "backspace",
+    ]
+    .into_iter()
+    .collect()
+});
+
 impl Command for Char {
     fn name(&self) -> &str {
         "char"
@@ -297,6 +316,11 @@ fn generate_character_list(signals: Signals, call_span: Span) -> PipelineData {
     CHAR_MAP
         .iter()
         .map(move |(name, s)| {
+            let character = if NO_OUTPUT_CHARS.contains(name) {
+                Value::string("", call_span)
+            } else {
+                Value::string(s, call_span)
+            };
             let unicode = Value::string(
                 s.chars()
                     .map(|c| format!("{:x}", c as u32))
@@ -306,7 +330,7 @@ fn generate_character_list(signals: Signals, call_span: Span) -> PipelineData {
             );
             let record = record! {
                 "name" => Value::string(*name, call_span),
-                "character" => Value::string(s, call_span),
+                "character" => character,
                 "unicode" => unicode,
             };
 

From 6fcd09682cbaa1e2b2077e023c5ea351ab3a6f00 Mon Sep 17 00:00:00 2001
From: Ian Manske <ian.manske@pm.me>
Date: Sun, 21 Jul 2024 21:15:36 +0000
Subject: [PATCH 13/14] Fix setting metadata on byte streams (#13416)

# Description
Setting metadata on a byte stream converts it to a list stream. This PR
makes it so that `metadata set` does not alter its input besides the
metadata.

```nushell
open --raw test.json
| metadata set --content-type application/json
| http post https://httpbin.org/post
```
---
 crates/nu-command/src/debug/metadata_set.rs | 40 +++++++++++----------
 1 file changed, 21 insertions(+), 19 deletions(-)

diff --git a/crates/nu-command/src/debug/metadata_set.rs b/crates/nu-command/src/debug/metadata_set.rs
index 96f50cfdba..c78a5df5c3 100644
--- a/crates/nu-command/src/debug/metadata_set.rs
+++ b/crates/nu-command/src/debug/metadata_set.rs
@@ -42,32 +42,32 @@ impl Command for MetadataSet {
         engine_state: &EngineState,
         stack: &mut Stack,
         call: &Call,
-        input: PipelineData,
+        mut input: PipelineData,
     ) -> Result<PipelineData, ShellError> {
         let head = call.head;
         let ds_fp: Option<String> = call.get_flag(engine_state, stack, "datasource-filepath")?;
         let ds_ls = call.has_flag(engine_state, stack, "datasource-ls")?;
         let content_type: Option<String> = call.get_flag(engine_state, stack, "content-type")?;
-        let signals = engine_state.signals().clone();
-        let metadata = input
-            .metadata()
-            .clone()
-            .unwrap_or_default()
-            .with_content_type(content_type);
+
+        let mut metadata = match &mut input {
+            PipelineData::Value(_, metadata)
+            | PipelineData::ListStream(_, metadata)
+            | PipelineData::ByteStream(_, metadata) => metadata.take().unwrap_or_default(),
+            PipelineData::Empty => return Err(ShellError::PipelineEmpty { dst_span: head }),
+        };
+
+        if let Some(content_type) = content_type {
+            metadata.content_type = Some(content_type);
+        }
 
         match (ds_fp, ds_ls) {
-            (Some(path), false) => Ok(input.into_pipeline_data_with_metadata(
-                head,
-                signals,
-                metadata.with_data_source(DataSource::FilePath(path.into())),
-            )),
-            (None, true) => Ok(input.into_pipeline_data_with_metadata(
-                head,
-                signals,
-                metadata.with_data_source(DataSource::Ls),
-            )),
-            _ => Ok(input.into_pipeline_data_with_metadata(head, signals, metadata)),
+            (Some(path), false) => metadata.data_source = DataSource::FilePath(path.into()),
+            (None, true) => metadata.data_source = DataSource::Ls,
+            (Some(_), true) => (), // TODO: error here
+            (None, false) => (),
         }
+
+        Ok(input.set_metadata(Some(metadata)))
     }
 
     fn examples(&self) -> Vec<Example> {
@@ -85,7 +85,9 @@ impl Command for MetadataSet {
             Example {
                 description: "Set the metadata of a file path",
                 example: "'crates' | metadata set --content-type text/plain | metadata",
-                result: Some(Value::record(record!("content_type" => Value::string("text/plain", Span::test_data())), Span::test_data())),
+                result: Some(Value::test_record(record! {
+                    "content_type" => Value::test_string("text/plain"),
+                })),
             },
         ]
     }

From 366e52b76dfda146dbc2bbdfda2d710169786d9b Mon Sep 17 00:00:00 2001
From: Darren Schroeder <343840+fdncred@users.noreply.github.com>
Date: Sun, 21 Jul 2024 18:42:11 -0500
Subject: [PATCH 14/14] Update query web example since wikipedia keeps changing
 (#13421)

# Description

Every so ofter wikipedia changes the column names which breaks the query
example. It would be good to make query web's table extraction to be
smart enough to find tables that are close. This PR fixes the example.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use toolkit.nu; toolkit test stdlib"` to run the
tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->
---
 crates/nu_plugin_query/src/query_web.rs | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crates/nu_plugin_query/src/query_web.rs b/crates/nu_plugin_query/src/query_web.rs
index 163b320e16..39f50e51af 100644
--- a/crates/nu_plugin_query/src/query_web.rs
+++ b/crates/nu_plugin_query/src/query_web.rs
@@ -76,7 +76,7 @@ pub fn web_examples() -> Vec<Example<'static>> {
         },
         Example {
             example: "http get https://en.wikipedia.org/wiki/List_of_cities_in_India_by_population |
-        query web --as-table [City 'Population(2011)[3]' 'Population(2001)[3][a]' 'State or unionterritory' 'Ref']",
+        query web --as-table [City 'Population(2011)[3]' 'Population(2001)[3][a]' 'State or unionterritory' 'Reference']",
             description: "Retrieve a html table from Wikipedia and parse it into a nushell table using table headers as guides",
             result: None
         },