nushell/crates/nu-cli/src/syntax_highlight.rs
Yash Thakur 21b3eeed99
Allow spreading arguments to commands (#11289)
<!--
if this PR closes one or more issues, you can automatically link the PR
with
them by using one of the [*linking
keywords*](https://docs.github.com/en/issues/tracking-your-work-with-issues/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword),
e.g.
- this PR should close #xxxx
- fixes #xxxx

you can also mention related issues, PRs or discussions!
-->

Finishes implementing https://github.com/nushell/nushell/issues/10598,
which asks for a spread operator in lists, in records, and when calling
commands.

# Description
<!--
Thank you for improving Nushell. Please, check our [contributing
guide](../CONTRIBUTING.md) and talk to the core team before making major
changes.

Description of your pull request goes here. **Provide examples and/or
screenshots** if your changes affect the user experience.
-->

This PR will allow spreading arguments to commands (both internal and
external). It will also deprecate spreading arguments automatically when
passing to external commands.

# User-Facing Changes
<!-- List of all changes that impact the user experience here. This
helps us keep track of breaking changes. -->

- Users will be able to use `...` to spread arguments to custom/builtin
commands that have rest parameters or allow unknown arguments, or to any
external command
- If a custom command doesn't have a rest parameter and it doesn't allow
unknown arguments either, the spread operator will not be allowed
- Passing lists to external commands without `...` will work for now but
will cause a deprecation warning saying that it'll stop working in 0.91
(is 2 versions enough time?)

Here's a function to help with demonstrating some behavior:
```nushell
> def foo [ a, b, c?, d?, ...rest ] { [$a $b $c $d $rest] | to nuon }
```

You can pass a list of arguments to fill in the `rest` parameter using
`...`:
```nushell
> foo 1 2 3 4 ...[5 6]
[1, 2, 3, 4, [5, 6]]
```

If you don't use `...`, the list `[5 6]` will be treated as a single
argument:

```nushell
> foo 1 2 3 4 [5 6] # Note the double [[]]
[1, 2, 3, 4, [[5, 6]]]
```

You can omit optional parameters before the spread arguments:
```nushell
> foo 1 2 3 ...[4 5] # d is omitted here
[1, 2, 3, null, [4, 5]]
```

If you have multiple lists, you can spread them all:
```nushell
> foo 1 2 3 ...[4 5] 6 7 ...[8] ...[]
[1, 2, 3, null, [4, 5, 6, 7, 8]]
```

Here's the kind of error you get when you try to spread arguments to a
command with no rest parameter:

![image](https://github.com/nushell/nushell/assets/45539777/93faceae-00eb-4e59-ac3f-17f98436e6e4)

And this is the warning you get when you pass a list to an external now
(without `...`):


![image](https://github.com/nushell/nushell/assets/45539777/d368f590-201e-49fb-8b20-68476ced415e)


# Tests + Formatting
<!--
Don't forget to add tests that cover your changes.

Make sure you've run and fixed any issues with these commands:

- `cargo fmt --all -- --check` to check standard code formatting (`cargo
fmt --all` applies these changes)
- `cargo clippy --workspace -- -D warnings -D clippy::unwrap_used` to
check that you're using the standard code style
- `cargo test --workspace` to check that all tests pass (on Windows make
sure to [enable developer
mode](https://learn.microsoft.com/en-us/windows/apps/get-started/developer-mode-features-and-debugging))
- `cargo run -- -c "use std testing; testing run-tests --path
crates/nu-std"` to run the tests for the standard library

> **Note**
> from `nushell` you can also use the `toolkit` as follows
> ```bash
> use toolkit.nu # or use an `env_change` hook to activate it
automatically
> toolkit check pr
> ```
-->

Added tests to cover the following cases:
- Spreading arguments to a command that doesn't have a rest parameter
(unexpected spread argument error)
- Spreading arguments to a command that doesn't have a rest parameter
*but* there's also a missing positional argument (missing positional
error)
- Spreading arguments to a command that doesn't have a rest parameter
but does allow unknown arguments, such as `exec` (allowed)
- Spreading a list literal containing arguments of the wrong type (parse
error)
- Spreading a non-list value, both to internal and external commands
- Having named arguments in the middle of rest arguments
- `explain`ing a command call that spreads its arguments

# After Submitting
<!-- If your PR had any user-facing changes, update [the
documentation](https://github.com/nushell/nushell.github.io) after the
PR is merged, if necessary. This will help us keep the docs up to date.
-->

# Examples

Suppose you have multiple tables:
```nushell
let people = [[id name age]; [0 alice 100] [1 bob 200] [2 eve 300]]
let evil_twins = [[id name age]; [0 ecila 100] [-1 bob 200] [-2 eve 300]]
```

Maybe you often find yourself needing to merge multiple tables and want
a utility to do that. You could write a function like this:
```nushell
def merge_all [ ...tables ] { $tables | reduce { |it, acc| $acc | merge $it } }
```

Then you can use it like this:
```nushell
> merge_all ...([$people $evil_twins] | each { |$it| $it | select name age })
╭───┬───────┬─────╮
│ # │ name  │ age │
├───┼───────┼─────┤
│ 0 │ ecila │ 100 │
│ 1 │ bob   │ 200 │
│ 2 │ eve   │ 300 │
╰───┴───────┴─────╯
```

Except they had duplicate columns, so now you first want to suffix every
column with a number to tell you which table the column came from. You
can make a command for that:
```nushell
def select_and_merge [ --cols: list<string>, ...tables ] {
  let renamed_tables = $tables
    | enumerate
    | each { |it|
      $it.item | select $cols | rename ...($cols | each { |col| $col + ($it.index | into string) })
    };
  merge_all ...$renamed_tables
}
```
And call it like this:
```nushell
> select_and_merge --cols [name age] $people $evil_twins
╭───┬───────┬──────┬───────┬──────╮
│ # │ name0 │ age0 │ name1 │ age1 │
├───┼───────┼──────┼───────┼──────┤
│ 0 │ alice │  100 │ ecila │  100 │
│ 1 │ bob   │  200 │ bob   │  200 │
│ 2 │ eve   │  300 │ eve   │  300 │
╰───┴───────┴──────┴───────┴──────╯
```

---

Suppose someone's made a command to search for APT packages:

```nushell
# The main command
def search-pkgs [
    --install                   # Whether to install any packages it finds
    log_level: int              # Pretend it's a good idea to make this a required positional parameter
    exclude?: list<string>      # Packages to exclude
    repositories?: list<string> # Which repositories to look in (searches in all if not given)
    ...pkgs                     # Package names to search for
] {
  { install: $install, log_level: $log_level, exclude: ($exclude | to nuon), repositories: ($repositories | to nuon), pkgs: ($pkgs | to nuon) }
}
```

It has a lot of parameters to configure it, so you might make your own
helper commands to wrap around it for specific cases. Here's one
example:
```nushell
# Only look for packages locally
def search-pkgs-local [
    --install              # Whether to install any packages it finds
    log_level: int
    exclude?: list<string> # Packages to exclude
    ...pkgs                # Package names to search for
] {
  # All required and optional positional parameters are given
  search-pkgs --install=$install $log_level [] ["<local URI or something>"] ...$pkgs
}
```
And you can run it like this:
```nushell
> search-pkgs-local --install=false 5 ...["python2.7" "vim"]
╭──────────────┬──────────────────────────────╮
│ install      │ false                        │
│ log_level    │ 5                            │
│ exclude      │ []                           │
│ repositories │ ["<local URI or something>"] │
│ pkgs         │ ["python2.7", vim]           │
╰──────────────┴──────────────────────────────╯
```

One thing I realized when writing this was that if we decide to not
allow passing optional arguments using the spread operator, then you can
(mis?)use the spread operator to skip optional parameters. Here, I
didn't want to give `exclude` explicitly, so I used a spread operator to
pass the packages to install. Without it, I would've needed to do
`search-pkgs-local --install=false 5 [] "python2.7" "vim"` (explicitly
pass `[]` (or `null`, in the general case) to `exclude`). There are
probably more idiomatic ways to do this, but I just thought it was
something interesting.

If you're a virologist of the [xkcd](https://xkcd.com/350/) kind,
another helper command you might make is this:
```nushell
# Install any packages it finds
def live-dangerously [ ...pkgs ] {
  # One optional argument was given (exclude), while another was not (repositories)
  search-pkgs 0 [] ...$pkgs --install # Flags can go after spread arguments
}
```

Running it:
```nushell
> live-dangerously "git" "*vi*" # *vi* because I don't feel like typing out vim and neovim
╭──────────────┬─────────────╮
│ install      │ true        │
│ log_level    │ 0           │
│ exclude      │ []          │
│ repositories │ null        │
│ pkgs         │ [git, *vi*] │
╰──────────────┴─────────────╯
```

Here's an example that uses the spread operator more than once within
the same command call:
```nushell
let extras = [ chrome firefox python java git ]

def search-pkgs-curated [ ...pkgs ] {
  (search-pkgs
      1
      [emacs]
      ["example.com", "foo.com"]
      vim # A must for everyone!
      ...($pkgs | filter { |p| not ($p | str contains "*") }) # Remove packages with globs
      python # Good tool to have
      ...$extras
      --install=false
      python3) # I forget, did I already put Python in extras?
}
```

Running it:
```nushell
> search-pkgs-curated "git" "*vi*"
╭──────────────┬───────────────────────────────────────────────────────────────────╮
│ install      │ false                                                             │
│ log_level    │ 1                                                                 │
│ exclude      │ [emacs]                                                           │
│ repositories │ [example.com, foo.com]                                            │
│ pkgs         │ [vim, git, python, chrome, firefox, python, java, git, "python3"] │
╰──────────────┴───────────────────────────────────────────────────────────────────╯
```
2023-12-28 15:43:20 +08:00

470 lines
18 KiB
Rust

use log::trace;
use nu_ansi_term::Style;
use nu_color_config::{get_matching_brackets_style, get_shape_color};
use nu_parser::{flatten_block, parse, FlatShape};
use nu_protocol::ast::{Argument, Block, Expr, Expression, PipelineElement, RecordItem};
use nu_protocol::engine::{EngineState, StateWorkingSet};
use nu_protocol::{Config, Span};
use reedline::{Highlighter, StyledText};
use std::sync::Arc;
pub struct NuHighlighter {
pub engine_state: Arc<EngineState>,
pub config: Config,
}
impl Highlighter for NuHighlighter {
fn highlight(&self, line: &str, _cursor: usize) -> StyledText {
trace!("highlighting: {}", line);
let highlight_resolved_externals =
self.engine_state.get_config().highlight_resolved_externals;
let mut working_set = StateWorkingSet::new(&self.engine_state);
let block = parse(&mut working_set, None, line.as_bytes(), false);
let (shapes, global_span_offset) = {
let mut shapes = flatten_block(&working_set, &block);
// Highlighting externals has a config point because of concerns that using which to resolve
// externals may slow down things too much.
if highlight_resolved_externals {
for (span, shape) in shapes.iter_mut() {
if *shape == FlatShape::External {
let str_contents =
working_set.get_span_contents(Span::new(span.start, span.end));
let str_word = String::from_utf8_lossy(str_contents).to_string();
if which::which(str_word).ok().is_some() {
*shape = FlatShape::ExternalResolved;
}
}
}
}
(shapes, self.engine_state.next_span_start())
};
let mut output = StyledText::default();
let mut last_seen_span = global_span_offset;
let global_cursor_offset = _cursor + global_span_offset;
let matching_brackets_pos = find_matching_brackets(
line,
&working_set,
&block,
global_span_offset,
global_cursor_offset,
);
for shape in &shapes {
if shape.0.end <= last_seen_span
|| last_seen_span < global_span_offset
|| shape.0.start < global_span_offset
{
// We've already output something for this span
// so just skip this one
continue;
}
if shape.0.start > last_seen_span {
let gap = line
[(last_seen_span - global_span_offset)..(shape.0.start - global_span_offset)]
.to_string();
output.push((Style::new(), gap));
}
let next_token = line
[(shape.0.start - global_span_offset)..(shape.0.end - global_span_offset)]
.to_string();
macro_rules! add_colored_token_with_bracket_highlight {
($shape:expr, $span:expr, $text:expr) => {{
let spans = split_span_by_highlight_positions(
line,
$span,
&matching_brackets_pos,
global_span_offset,
);
spans.iter().for_each(|(part, highlight)| {
let start = part.start - $span.start;
let end = part.end - $span.start;
let text = (&next_token[start..end]).to_string();
let mut style = get_shape_color($shape.to_string(), &self.config);
if *highlight {
style = get_matching_brackets_style(style, &self.config);
}
output.push((style, text));
});
}};
}
let mut add_colored_token = |shape: &FlatShape, text: String| {
output.push((get_shape_color(shape.to_string(), &self.config), text));
};
match shape.1 {
FlatShape::Garbage => add_colored_token(&shape.1, next_token),
FlatShape::Nothing => add_colored_token(&shape.1, next_token),
FlatShape::Binary => add_colored_token(&shape.1, next_token),
FlatShape::Bool => add_colored_token(&shape.1, next_token),
FlatShape::Int => add_colored_token(&shape.1, next_token),
FlatShape::Float => add_colored_token(&shape.1, next_token),
FlatShape::Range => add_colored_token(&shape.1, next_token),
FlatShape::InternalCall(_) => add_colored_token(&shape.1, next_token),
FlatShape::External => add_colored_token(&shape.1, next_token),
FlatShape::ExternalArg => add_colored_token(&shape.1, next_token),
FlatShape::ExternalResolved => add_colored_token(&shape.1, next_token),
FlatShape::Keyword => add_colored_token(&shape.1, next_token),
FlatShape::Literal => add_colored_token(&shape.1, next_token),
FlatShape::Operator => add_colored_token(&shape.1, next_token),
FlatShape::Signature => add_colored_token(&shape.1, next_token),
FlatShape::String => add_colored_token(&shape.1, next_token),
FlatShape::StringInterpolation => add_colored_token(&shape.1, next_token),
FlatShape::DateTime => add_colored_token(&shape.1, next_token),
FlatShape::List => {
add_colored_token_with_bracket_highlight!(shape.1, shape.0, next_token)
}
FlatShape::Table => {
add_colored_token_with_bracket_highlight!(shape.1, shape.0, next_token)
}
FlatShape::Record => {
add_colored_token_with_bracket_highlight!(shape.1, shape.0, next_token)
}
FlatShape::Block => {
add_colored_token_with_bracket_highlight!(shape.1, shape.0, next_token)
}
FlatShape::Closure => {
add_colored_token_with_bracket_highlight!(shape.1, shape.0, next_token)
}
FlatShape::Filepath => add_colored_token(&shape.1, next_token),
FlatShape::Directory => add_colored_token(&shape.1, next_token),
FlatShape::GlobPattern => add_colored_token(&shape.1, next_token),
FlatShape::Variable(_) | FlatShape::VarDecl(_) => {
add_colored_token(&shape.1, next_token)
}
FlatShape::Flag => add_colored_token(&shape.1, next_token),
FlatShape::Pipe => add_colored_token(&shape.1, next_token),
FlatShape::And => add_colored_token(&shape.1, next_token),
FlatShape::Or => add_colored_token(&shape.1, next_token),
FlatShape::Redirection => add_colored_token(&shape.1, next_token),
FlatShape::Custom(..) => add_colored_token(&shape.1, next_token),
FlatShape::MatchPattern => add_colored_token(&shape.1, next_token),
}
last_seen_span = shape.0.end;
}
let remainder = line[(last_seen_span - global_span_offset)..].to_string();
if !remainder.is_empty() {
output.push((Style::new(), remainder));
}
output
}
}
fn split_span_by_highlight_positions(
line: &str,
span: Span,
highlight_positions: &[usize],
global_span_offset: usize,
) -> Vec<(Span, bool)> {
let mut start = span.start;
let mut result: Vec<(Span, bool)> = Vec::new();
for pos in highlight_positions {
if start <= *pos && pos < &span.end {
if start < *pos {
result.push((Span::new(start, *pos), false));
}
let span_str = &line[pos - global_span_offset..span.end - global_span_offset];
let end = span_str
.chars()
.next()
.map(|c| pos + get_char_length(c))
.unwrap_or(pos + 1);
result.push((Span::new(*pos, end), true));
start = end;
}
}
if start < span.end {
result.push((Span::new(start, span.end), false));
}
result
}
fn find_matching_brackets(
line: &str,
working_set: &StateWorkingSet,
block: &Block,
global_span_offset: usize,
global_cursor_offset: usize,
) -> Vec<usize> {
const BRACKETS: &str = "{}[]()";
// calculate first bracket position
let global_end_offset = line.len() + global_span_offset;
let global_bracket_pos =
if global_cursor_offset == global_end_offset && global_end_offset > global_span_offset {
// cursor is at the end of a non-empty string -- find block end at the previous position
if let Some(last_char) = line.chars().last() {
global_cursor_offset - get_char_length(last_char)
} else {
global_cursor_offset
}
} else {
// cursor is in the middle of a string -- find block end at the current position
global_cursor_offset
};
// check that position contains bracket
let match_idx = global_bracket_pos - global_span_offset;
if match_idx >= line.len()
|| !BRACKETS.contains(get_char_at_index(line, match_idx).unwrap_or_default())
{
return Vec::new();
}
// find matching bracket by finding matching block end
let matching_block_end = find_matching_block_end_in_block(
line,
working_set,
block,
global_span_offset,
global_bracket_pos,
);
if let Some(pos) = matching_block_end {
let matching_idx = pos - global_span_offset;
if BRACKETS.contains(get_char_at_index(line, matching_idx).unwrap_or_default()) {
return if global_bracket_pos < pos {
vec![global_bracket_pos, pos]
} else {
vec![pos, global_bracket_pos]
};
}
}
Vec::new()
}
fn find_matching_block_end_in_block(
line: &str,
working_set: &StateWorkingSet,
block: &Block,
global_span_offset: usize,
global_cursor_offset: usize,
) -> Option<usize> {
for p in &block.pipelines {
for e in &p.elements {
match e {
PipelineElement::Expression(_, e)
| PipelineElement::Redirection(_, _, e, _)
| PipelineElement::And(_, e)
| PipelineElement::Or(_, e)
| PipelineElement::SameTargetRedirection { cmd: (_, e), .. }
| PipelineElement::SeparateRedirection { out: (_, e, _), .. } => {
if e.span.contains(global_cursor_offset) {
if let Some(pos) = find_matching_block_end_in_expr(
line,
working_set,
e,
global_span_offset,
global_cursor_offset,
) {
return Some(pos);
}
}
}
}
}
}
None
}
fn find_matching_block_end_in_expr(
line: &str,
working_set: &StateWorkingSet,
expression: &Expression,
global_span_offset: usize,
global_cursor_offset: usize,
) -> Option<usize> {
macro_rules! find_in_expr_or_continue {
($inner_expr:ident) => {
if let Some(pos) = find_matching_block_end_in_expr(
line,
working_set,
$inner_expr,
global_span_offset,
global_cursor_offset,
) {
return Some(pos);
}
};
}
if expression.span.contains(global_cursor_offset) && expression.span.start >= global_span_offset
{
let expr_first = expression.span.start;
let span_str = &line
[expression.span.start - global_span_offset..expression.span.end - global_span_offset];
let expr_last = span_str
.chars()
.last()
.map(|c| expression.span.end - get_char_length(c))
.unwrap_or(expression.span.start);
return match &expression.expr {
Expr::Bool(_) => None,
Expr::Int(_) => None,
Expr::Float(_) => None,
Expr::Binary(_) => None,
Expr::Range(..) => None,
Expr::Var(_) => None,
Expr::VarDecl(_) => None,
Expr::ExternalCall(..) => None,
Expr::Operator(_) => None,
Expr::UnaryNot(_) => None,
Expr::Keyword(..) => None,
Expr::ValueWithUnit(..) => None,
Expr::DateTime(_) => None,
Expr::Filepath(_) => None,
Expr::Directory(_) => None,
Expr::GlobPattern(_) => None,
Expr::String(_) => None,
Expr::CellPath(_) => None,
Expr::ImportPattern(_) => None,
Expr::Overlay(_) => None,
Expr::Signature(_) => None,
Expr::MatchBlock(_) => None,
Expr::Nothing => None,
Expr::Garbage => None,
Expr::Spread(_) => None,
Expr::Table(hdr, rows) => {
if expr_last == global_cursor_offset {
// cursor is at table end
Some(expr_first)
} else if expr_first == global_cursor_offset {
// cursor is at table start
Some(expr_last)
} else {
// cursor is inside table
for inner_expr in hdr {
find_in_expr_or_continue!(inner_expr);
}
for row in rows {
for inner_expr in row {
find_in_expr_or_continue!(inner_expr);
}
}
None
}
}
Expr::Record(exprs) => {
if expr_last == global_cursor_offset {
// cursor is at record end
Some(expr_first)
} else if expr_first == global_cursor_offset {
// cursor is at record start
Some(expr_last)
} else {
// cursor is inside record
for expr in exprs {
match expr {
RecordItem::Pair(k, v) => {
find_in_expr_or_continue!(k);
find_in_expr_or_continue!(v);
}
RecordItem::Spread(_, record) => {
find_in_expr_or_continue!(record);
}
}
}
None
}
}
Expr::Call(call) => {
for arg in &call.arguments {
let opt_expr = match arg {
Argument::Named((_, _, opt_expr)) => opt_expr.as_ref(),
Argument::Positional(inner_expr) => Some(inner_expr),
Argument::Unknown(inner_expr) => Some(inner_expr),
Argument::Spread(inner_expr) => Some(inner_expr),
};
if let Some(inner_expr) = opt_expr {
find_in_expr_or_continue!(inner_expr);
}
}
None
}
Expr::FullCellPath(b) => find_matching_block_end_in_expr(
line,
working_set,
&b.head,
global_span_offset,
global_cursor_offset,
),
Expr::BinaryOp(lhs, op, rhs) => {
find_in_expr_or_continue!(lhs);
find_in_expr_or_continue!(op);
find_in_expr_or_continue!(rhs);
None
}
Expr::Block(block_id)
| Expr::Closure(block_id)
| Expr::RowCondition(block_id)
| Expr::Subexpression(block_id) => {
if expr_last == global_cursor_offset {
// cursor is at block end
Some(expr_first)
} else if expr_first == global_cursor_offset {
// cursor is at block start
Some(expr_last)
} else {
// cursor is inside block
let nested_block = working_set.get_block(*block_id);
find_matching_block_end_in_block(
line,
working_set,
nested_block,
global_span_offset,
global_cursor_offset,
)
}
}
Expr::StringInterpolation(inner_expr) => {
for inner_expr in inner_expr {
find_in_expr_or_continue!(inner_expr);
}
None
}
Expr::List(inner_expr) => {
if expr_last == global_cursor_offset {
// cursor is at list end
Some(expr_first)
} else if expr_first == global_cursor_offset {
// cursor is at list start
Some(expr_last)
} else {
// cursor is inside list
for inner_expr in inner_expr {
find_in_expr_or_continue!(inner_expr);
}
None
}
}
};
}
None
}
fn get_char_at_index(s: &str, index: usize) -> Option<char> {
s[index..].chars().next()
}
fn get_char_length(c: char) -> usize {
c.to_string().len()
}