Robin Malfait 1ef97759e3
Add @source not support (#17255)
This PR adds a new source detection feature: `@source not "…"`. It can
be used to exclude files specifically from your source configuration
without having to think about creating a rule that matches all but the
requested file:

```css
@import "tailwindcss";
@source not "../src/my-tailwind-js-plugin.js";
```

While working on this feature, we noticed that we scanned the file
system in multiple places, each with different heuristics. These are:

- Auto source detection (so the default configuration or an `@source
"./my-dir"`)
- Custom sources (e.g. `@source "./**/*.bin"`; these contain file
extensions)
- The code to detect updates on the file system

Because of the different heuristics, we were able to construct failing
cases (e.g. when you create a new file in `my-dir` that would be
thrown out by auto-source detection, it would actually be scanned). We
were also leaving a lot of performance on the table, as the file system
was traversed multiple times in certain cases.

To resolve these issues, we're now unifying all of these systems into
one `ignore` crate walker setup. We also implemented features like
auto-source-detection and the `not` flag as additional _gitignore_ rules
only, avoiding the need for a lot of custom code to make decisions.

At a high level, this is what happens now:

- We collect all non-negative `@source` rules into a list of _roots_
(that is, the source directory for this rule) and optional _globs_ (that
is, the actual rules for files in this root). For custom sources (i.e.
with a custom `glob`), we add an allowlist rule to the gitignore setup,
so that we can be sure these files are always included.
- For every negative `@source` rule, we create respective ignore rules.
- Furthermore, we have a custom filter that ensures files are only read
if they have been changed since the last time they were read.
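
The "only read if changed" filter from the last bullet could be pictured roughly as follows. This is a minimal sketch, not the code from this PR: `ChangedFilter` and `should_read` are hypothetical names, and it assumes that comparing file modification times is a good enough signal for "changed since the last read".

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::SystemTime;

/// Hypothetical helper (not part of the PR): remembers the modification
/// time that was observed for each path the last time it was checked.
#[derive(Default)]
pub struct ChangedFilter {
  last_seen: HashMap<PathBuf, SystemTime>,
}

impl ChangedFilter {
  /// Returns `Ok(true)` when `path` has not been seen before or when its
  /// modification time differs from the recorded one. The new modification
  /// time is recorded either way.
  pub fn should_read(&mut self, path: &Path) -> io::Result<bool> {
    let modified = fs::metadata(path)?.modified()?;
    match self.last_seen.insert(path.to_path_buf(), modified) {
      Some(previous) => Ok(previous != modified),
      None => Ok(true),
    }
  }
}
```

A walker would call `should_read` for each file that survives the ignore rules and skip the read when it returns `Ok(false)`.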

So, consider the following setup:

```css
/* packages/web/src/index.css */
@import "tailwindcss";
@source "../../lib/ui/**/*.bin";
@source not "../../lib/ui/expensive.bin";
```

This creates a git ignore file that (simplified) looks like this:

```gitignore
# Auto-source rules
*.{exe,node,bin,…}
*.{css,scss,sass,…}
{node_modules,git}/

# Custom sources can overwrite auto-source rules
!lib/ui/**/*.bin

# Negative rules
lib/ui/expensive.bin
```
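
To make the mapping from `@source` rules to the lines above more concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the PR's implementation: `to_ignore_rules` is a hypothetical helper, the struct only mirrors the shape of the `SourceEntry` binding, and `base` is assumed to already be resolved relative to the directory the rules apply to.

```rust
/// Mirrors the shape of the `SourceEntry` binding: a base directory, a glob
/// pattern, and a negation flag.
pub struct SourceEntry {
  pub base: String,
  pub pattern: String,
  pub negated: bool,
}

/// Hypothetical helper: turns `@source` entries into gitignore-style lines.
/// Positive entries become allowlist rules (`!…`), negated entries become
/// plain ignore rules. The order of the input is preserved.
pub fn to_ignore_rules(sources: &[SourceEntry]) -> Vec<String> {
  sources
    .iter()
    .map(|source| {
      let base = source.base.trim_end_matches('/');
      let path = if base.is_empty() {
        source.pattern.clone()
      } else {
        format!("{}/{}", base, source.pattern)
      };
      if source.negated {
        path
      } else {
        format!("!{}", path)
      }
    })
    .collect()
}
```

With the two rules from the example, this yields `!lib/ui/**/*.bin` followed by `lib/ui/expensive.bin`.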

We then use this information _on top of your existing `.gitignore`
setup_ to resolve files (i.e. if your `.gitignore` contains rules such as
`dist/`, those lines are added _before_ any of the rules outlined in the
example above). This allows negated (`!`) rules to allow-list paths that
your `.gitignore` rules would otherwise ignore.
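
Why the order matters can be sketched with a tiny "last matching rule wins" resolver. This is a simplified model, not the forked `ignore` crate: `Pattern`, `Rule`, and `is_included` are hypothetical names, and the pattern matching is deliberately reduced to three cases instead of full gitignore globs.

```rust
/// A deliberately simplified pattern; real gitignore globs are richer.
pub enum Pattern {
  /// Like `dist/`: matches any file below a directory with this name.
  Dir(String),
  /// Like `*.bin` (empty prefix) or `lib/ui/**/*.bin` (prefix `lib/ui/`).
  Ext { prefix: String, ext: String },
  /// Like `lib/ui/expensive.bin`: matches exactly this path.
  Exact(String),
}

pub struct Rule {
  pub pattern: Pattern,
  /// `true` for allowlist rules (`!…`), `false` for ignore rules.
  pub allow: bool,
}

impl Pattern {
  fn matches(&self, path: &str) -> bool {
    match self {
      // Skip the last component (the file name) and compare directory names.
      Pattern::Dir(dir) => path.split('/').rev().skip(1).any(|part| part == dir.as_str()),
      Pattern::Ext { prefix, ext } => {
        path.starts_with(prefix.as_str()) && path.ends_with(format!(".{}", ext).as_str())
      }
      Pattern::Exact(exact) => path == exact.as_str(),
    }
  }
}

/// The last rule that matches decides; paths that match no rule are included.
pub fn is_included(rules: &[Rule], path: &str) -> bool {
  rules
    .iter()
    .rev()
    .find(|rule| rule.pattern.matches(path))
    .map_or(true, |rule| rule.allow)
}
```

Feeding it the rules in the order `dist/`, `*.bin`, `!lib/ui/**/*.bin`, `lib/ui/expensive.bin` includes `lib/ui/icons/a.bin` but still excludes `lib/ui/expensive.bin` and everything under `dist/`.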

To implement this, we rely on the `ignore` crate, but we had to make
various very specific changes to it, so we decided to fork the crate.
All changes are prefixed with a `// CHANGED:` block, but here are the
most important ones:

- We added a way to add custom ignore rules that _extend_ (rather than
overwrite) your existing `.gitignore` rules
- We updated the order in which files are resolved and made it so that
more-specific files can allow-list more generic ignore rules.
- We resolved various issues related to adding more than one base path
to the traversal and ensured it works consistently on Linux, macOS, and
Windows.

## Behavioral changes

1. Any custom glob defined via `@source` now wins over your `.gitignore`
file and the auto-content rules.
   - Resolves #16920
2. The `node_modules` and `.git` folders as well as the `.gitignore`
file are now ignored by default (but can be overridden by an explicit
`@source` rule).
   - Resolves #17318
   - Resolves #15882
3. Source paths into ignored-by-default folders (like `node_modules`)
now also win over your `.gitignore` configuration and auto-content
rules.
   - Resolves #16669
4. Introduced `@source not "…"` to negate any previous rules.
   - Resolves #17058
5. Negative `content` rules in your legacy JavaScript configuration
(e.g. `content: ['!./src']`) now work with v4.
   - Resolves #15943
6. The order of `@source` definitions matters now, because you can
technically include or negate previous rules. This is similar to your
`.gitignore` file.
7. Rebuilds in watch mode now take the `@source` configuration into
account.
   - Resolves #15684
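
As an illustration of the legacy negative `content` rules mentioned above, a leading `!` can be read as a negation flag. This is a minimal sketch under that assumption; `parse_content_entry` is a hypothetical helper and not the function used by the compatibility layer.

```rust
/// Hypothetical helper: splits a legacy `content` entry into its pattern and
/// a negation flag. A leading `!` marks the entry as negated.
pub fn parse_content_entry(entry: &str) -> (String, bool) {
  match entry.strip_prefix('!') {
    Some(rest) => (rest.to_string(), true),
    None => (entry.to_string(), false),
  }
}
```

Under this reading, `'!./src'` becomes the pattern `./src` with the negation flag set, which lines up with the `negated` field on `SourceEntry`.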

## Combining with other features

Note that the `not` flag is also already compatible with [`@source
inline(…)`](https://github.com/tailwindlabs/tailwindcss/pull/17147)
added in an earlier commit:

```css
@import "tailwindcss";
@source not inline("container");
```

## Test plan

- We added a bunch of oxide unit tests to ensure that the right files
are scanned
- We updated the existing integration tests with new `@source not "…"`
specific examples and updated the existing tests to match the subtle
behavior changes
- We also added a new special tag `[ci-all]` that, when added to the
description of a PR, causes the PR to run unit and integration tests on
all operating systems.

[ci-all]

---------

Co-authored-by: Philipp Spiess <hello@philippspiess.com>
2025-03-25 15:54:41 +01:00


use utf16::IndexConverter;

#[macro_use]
extern crate napi_derive;

mod utf16;

#[derive(Debug, Clone)]
#[napi(object)]
pub struct ChangedContent {
  /// File path to the changed file
  pub file: Option<String>,

  /// Contents of the changed file
  pub content: Option<String>,

  /// File extension
  pub extension: String,
}

#[derive(Debug, Clone)]
#[napi(object)]
pub struct GlobEntry {
  /// Base path of the glob
  pub base: String,

  /// Glob pattern
  pub pattern: String,
}

#[derive(Debug, Clone)]
#[napi(object)]
pub struct SourceEntry {
  /// Base path of the glob
  pub base: String,

  /// Glob pattern
  pub pattern: String,

  /// Negated flag
  pub negated: bool,
}
impl From<ChangedContent> for tailwindcss_oxide::ChangedContent {
  fn from(changed_content: ChangedContent) -> Self {
    if let Some(file) = changed_content.file {
      return tailwindcss_oxide::ChangedContent::File(file.into(), changed_content.extension);
    }

    if let Some(contents) = changed_content.content {
      return tailwindcss_oxide::ChangedContent::Content(contents, changed_content.extension);
    }

    unreachable!()
  }
}

impl From<GlobEntry> for tailwindcss_oxide::GlobEntry {
  fn from(glob: GlobEntry) -> Self {
    Self {
      base: glob.base,
      pattern: glob.pattern,
    }
  }
}

impl From<tailwindcss_oxide::GlobEntry> for GlobEntry {
  fn from(glob: tailwindcss_oxide::GlobEntry) -> Self {
    Self {
      base: glob.base,
      pattern: glob.pattern,
    }
  }
}

impl From<SourceEntry> for tailwindcss_oxide::PublicSourceEntry {
  fn from(source: SourceEntry) -> Self {
    Self {
      base: source.base,
      pattern: source.pattern,
      negated: source.negated,
    }
  }
}
// ---

#[derive(Debug, Clone)]
#[napi(object)]
pub struct ScannerOptions {
  /// Glob sources
  pub sources: Option<Vec<SourceEntry>>,
}

#[derive(Debug, Clone)]
#[napi]
pub struct Scanner {
  scanner: tailwindcss_oxide::Scanner,
}

#[derive(Debug, Clone)]
#[napi(object)]
pub struct CandidateWithPosition {
  /// The candidate string
  pub candidate: String,

  /// The position of the candidate inside the content file
  pub position: i64,
}
#[napi]
impl Scanner {
  #[napi(constructor)]
  pub fn new(opts: ScannerOptions) -> Self {
    Self {
      scanner: tailwindcss_oxide::Scanner::new(match opts.sources {
        Some(sources) => sources.into_iter().map(Into::into).collect(),
        None => vec![],
      }),
    }
  }

  #[napi]
  pub fn scan(&mut self) -> Vec<String> {
    self.scanner.scan()
  }

  #[napi]
  pub fn scan_files(&mut self, input: Vec<ChangedContent>) -> Vec<String> {
    self
      .scanner
      .scan_content(input.into_iter().map(Into::into).collect())
  }

  #[napi]
  pub fn get_candidates_with_positions(
    &mut self,
    input: ChangedContent,
  ) -> Vec<CandidateWithPosition> {
    let content = input.content.unwrap_or_else(|| {
      std::fs::read_to_string(input.file.unwrap()).expect("Failed to read file")
    });

    let input = ChangedContent {
      file: None,
      content: Some(content.clone()),
      extension: input.extension,
    };

    let mut utf16_idx = IndexConverter::new(&content[..]);

    self
      .scanner
      .get_candidates_with_positions(input.into())
      .into_iter()
      .map(|(candidate, position)| CandidateWithPosition {
        candidate,
        position: utf16_idx.get(position),
      })
      .collect()
  }

  #[napi(getter)]
  pub fn files(&mut self) -> Vec<String> {
    self.scanner.get_files()
  }

  #[napi(getter)]
  pub fn globs(&mut self) -> Vec<GlobEntry> {
    self
      .scanner
      .get_globs()
      .into_iter()
      .map(Into::into)
      .collect()
  }

  #[napi(getter)]
  pub fn normalized_sources(&mut self) -> Vec<GlobEntry> {
    self
      .scanner
      .get_normalized_sources()
      .into_iter()
      .map(Into::into)
      .collect()
  }
}
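
The `utf16` module that provides `IndexConverter` is not shown in this file. The sketch below is only a guess at what such a converter might look like, assuming it maps UTF-8 byte offsets (as used by Rust strings) to UTF-16 code unit offsets (as used by JavaScript strings) and that offsets are requested in non-decreasing order. The field names and the exact behavior are assumptions, not the real module.

```rust
/// Assumed shape of a converter from UTF-8 byte offsets to UTF-16 code unit
/// offsets. It walks the input once and keeps its position between calls.
pub struct IndexConverter<'a> {
  input: &'a str,
  utf8_pos: usize,
  utf16_pos: usize,
}

impl<'a> IndexConverter<'a> {
  pub fn new(input: &'a str) -> Self {
    Self {
      input,
      utf8_pos: 0,
      utf16_pos: 0,
    }
  }

  /// Returns the UTF-16 offset for `byte_offset`. Offsets must be requested
  /// in non-decreasing order; a smaller offset returns the current position.
  pub fn get(&mut self, byte_offset: usize) -> i64 {
    let input = self.input;
    let target = byte_offset.min(input.len());

    // `utf8_pos` only ever advances by whole characters, so it always sits
    // on a character boundary and slicing here cannot panic.
    for ch in input[self.utf8_pos..].chars() {
      if self.utf8_pos >= target {
        break;
      }
      self.utf8_pos += ch.len_utf8();
      self.utf16_pos += ch.len_utf16();
    }

    self.utf16_pos as i64
  }
}
```

Keeping the cursor between calls means a sorted list of candidate positions can be converted in a single pass over the content, which fits how `get_candidates_with_positions` calls `get` inside a `map`.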