Using semgrep¶
While taking a look at the changes required to implement a solution for the "Add support to SAST for the --disable-nosem option" issue, I noticed that --disable-nosem and --sarif seemed to interact in a curious way.
What I observed¶
When I passed --disable-nosem to the semgrep executable, the results seemed the same as if I had not passed --disable-nosem.
I ultimately attributed this unexpected behavior to the upstream issue titled: semgrep with "--sarif" "# nosemgrep" comments are ignored for python.
Do my changes work as expected if I test with a project that does not use Python?
I looked for more issues about nosemgrep and --sarif not quite working properly and found the "nosemgrep comments in TypeScript seem to be ignored" issue. That led me to what I believe is a 🔑 key piece of information in the "Include ignored findings in SARIF output using suppression syntax" issue:
- When using --sarif (to get the output in SARIF format), findings that are to be ignored with nosemgrep are included in the output with a note that they have been suppressed.
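To make that behavior concrete, here is a minimal Python sketch that splits the results of a SARIF report into active and suppressed findings. It assumes the SARIF 2.1.0 layout (runs, each with a results array whose entries may carry a suppressions array). The function name split_by_suppression, the file name app.js, and the example.* rule IDs are hypothetical, and the sample data is hand-written, not real semgrep output.

```python
def split_by_suppression(sarif):
    """Return (active, suppressed) lists of results from a SARIF document."""
    active, suppressed = [], []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # A non-empty "suppressions" array marks the finding as suppressed.
            if result.get("suppressions"):
                suppressed.append(result)
            else:
                active.append(result)
    return active, suppressed


def make_result(rule_id, line, suppressions=None):
    """Build a minimal SARIF result (hypothetical helper for the sample data)."""
    result = {
        "ruleId": rule_id,
        "locations": [
            {
                "physicalLocation": {
                    "artifactLocation": {"uri": "app.js"},  # assumed file name
                    "region": {"startLine": line},
                }
            }
        ],
    }
    if suppressions is not None:
        result["suppressions"] = suppressions
    return result


# Hand-written sample: three findings, one of them suppressed in source.
sample = {
    "runs": [
        {
            "results": [
                make_result("example.rule-a", 4),
                make_result("example.rule-b", 9),
                make_result(
                    "eslint.detect-non-literal-regexp", 16, [{"kind": "inSource"}]
                ),
            ]
        }
    ]
}

active, suppressed = split_by_suppression(sample)
print(len(active), "active,", len(suppressed), "suppressed")
```

With a real report, the sample dictionary would be replaced by the result of json.load on the .sarif file.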
What effect should --disable-nosem have when --sarif is used?
Let's first test with a simple project where we know nosem works. Let's make sure to use a language for which GitLab SAST uses semgrep as the analyzer. It looks like JavaScript is a good example.
semgrep scan --sarif --output semgrep-sarif.json
/usr/local/bin/semgrep -f /rules -o semgrep.sarif --sarif --no-rewrite-rule-ids --strict --disable-version-check --no-git-ignore --exclude spec --exclude test --exclude tests --exclude tmp --metrics on --max-memory 0 --verbose
With the items in qa/fixtures/js/default (remote: git@gitlab.com:gitlab-org/security-products/analyzers/semgrep.git), there are 3 findings.
Parsing the .sarif file with jq '.runs[0].results' semgrep.sarif lets me see the results. Let's mark the one on line 16 with nosem and see what the .sarif report says after that. The scan output now includes:
found 'nosem' comment, skipping rule 'eslint.detect-non-literal-regexp' on line 16
but also:
Ran 11 rules on 1 file: 3 findings.
- ❓ Does it make sense that there are 3 findings when we know that one of them is to be ignored?
Let's see if the .sarif file says anything special about the finding on line 16.
Yes:
"properties": {},
"ruleId": "eslint.detect-non-literal-regexp",
"suppressions": [
{
"kind": "inSource"
}
]
},
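The lookup above can also be scripted. The following is a small Python sketch that returns the results whose location starts on a given line, so the suppressions for line 16 can be inspected directly. It assumes the SARIF 2.1.0 field names (locations, physicalLocation, region, startLine). The function name results_on_line, the file name app.js, and the example.rule-a rule ID are hypothetical, and the sample is hand-written to mirror the fragment above.

```python
def results_on_line(sarif, line):
    """Return the results that have a location starting on the given line."""
    matches = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            for location in result.get("locations", []):
                region = location.get("physicalLocation", {}).get("region", {})
                if region.get("startLine") == line:
                    matches.append(result)
                    break  # one matching location is enough for this result
    return matches


# Hand-written sample that mirrors the fragment above; not real semgrep output.
sample = {
    "runs": [
        {
            "results": [
                {
                    "ruleId": "example.rule-a",  # hypothetical rule ID
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "app.js"},
                                "region": {"startLine": 4},
                            }
                        }
                    ],
                },
                {
                    "ruleId": "eslint.detect-non-literal-regexp",
                    "locations": [
                        {
                            "physicalLocation": {
                                "artifactLocation": {"uri": "app.js"},
                                "region": {"startLine": 16},
                            }
                        }
                    ],
                    "properties": {},
                    "suppressions": [{"kind": "inSource"}],
                },
            ]
        }
    ]
}

on_line_16 = results_on_line(sample, 16)
for result in on_line_16:
    kinds = [s.get("kind") for s in result.get("suppressions", [])]
    print(result["ruleId"], kinds)
```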
Then let's see what happens when we pass --disable-nosem:
semgrep scan --disable-nosem --sarif --output semgrep-sarif.json
/usr/local/bin/semgrep -f /rules -o semgrep.sarif --sarif --no-rewrite-rule-ids --strict --disable-version-check --no-git-ignore --exclude spec --exclude test --exclude tests --exclude tmp --metrics on --max-memory 0 --disable-nosem --verbose
We still see that there are 3 findings. There is still an inSource suppression for the item on line 16.
In other words: it seems like the --disable-nosem flag does not change the Scan Summary or the content of the suppressions in the SARIF report.
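One way to check this kind of claim mechanically is to reduce each report to a summary of its suppressions and compare the two summaries. The Python sketch below does that under the assumption of SARIF 2.1.0 field names. The function name suppression_summary and the sample reports are hypothetical stand-ins for the two semgrep.sarif files produced with and without --disable-nosem.

```python
import copy


def suppression_summary(sarif):
    """Map (ruleId, startLine) to the sorted suppression kinds of each result."""
    summary = {}
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            locations = result.get("locations") or [{}]
            region = locations[0].get("physicalLocation", {}).get("region", {})
            key = (result.get("ruleId"), region.get("startLine"))
            kinds = sorted(s.get("kind", "") for s in result.get("suppressions", []))
            summary[key] = kinds
    return summary


# Hand-written stand-in for the report produced without --disable-nosem.
default_report = {
    "runs": [
        {
            "results": [
                {
                    "ruleId": "eslint.detect-non-literal-regexp",
                    "locations": [
                        {"physicalLocation": {"region": {"startLine": 16}}}
                    ],
                    "suppressions": [{"kind": "inSource"}],
                }
            ]
        }
    ]
}

# Stand-in for the report produced with --disable-nosem: identical content,
# matching what was observed above.
flag_report = copy.deepcopy(default_report)
reports_match = suppression_summary(default_report) == suppression_summary(flag_report)
print("suppressions identical:", reports_match)

# For contrast, a report in which the flag had removed the suppression.
stripped_report = copy.deepcopy(default_report)
del stripped_report["runs"][0]["results"][0]["suppressions"]
reports_differ = suppression_summary(default_report) != suppression_summary(stripped_report)
print("difference detected:", reports_differ)
```

Keying on rule ID and start line is a simplification; two findings from the same rule on the same line would collapse into one entry.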
Let's permit nosem again (no --disable-nosem) and drop the --sarif output.
Now, only 2 findings are reported.
Testing¶
docker run --rm --mount type=bind,source="$(pwd)",target=/tmp/app -it registry.gitlab.com/bcarranza/catssharetheirrules:disablenosemv0 /bin/sh
What I confirmed¶
- By default, --enable-nosem is enabled.