[ovs-dev] [PATCH v1] checkpatch: Detect "trojan source" attack

Gaëtan Rivet grive at u256.net
Mon Nov 15 09:18:52 UTC 2021



On Wed, Nov 10, 2021, at 15:31, Mike Pattrick wrote:
> On Wed, Nov 10, 2021 at 6:30 AM Gaëtan Rivet <grive at u256.net> wrote:
>>
>> On Tue, Nov 2, 2021, at 19:43, Mike Pattrick wrote:
>> > Recently there has been a lot of press about the "trojan source" attack,
>> > where Unicode characters are used to obfuscate the true functionality of
>> > code. This attack didn't affect OVS, but adding the check here will help
>> > guard against it sneaking in later.
>> >
>> > Signed-off-by: Mike Pattrick <mkp at redhat.com>
>>
>> Hi,
>>
>> What did you base the selection of characters to blacklist on?
>
> I believe this list was sourced from https://unicode.org/reports/tr9/
>

Sure, I'm just thinking about zero-width characters, which can be used
to subtly introduce off-by-one errors. A check limited to the
bidirectional control characters seems incomplete.

>> Reading issues open on other languages, I haven't found a good comprehensive
>> set of characters that would need to be blacklisted. I'm not sure it is a sufficient
>> approach: getting creative and circumventing this kind of blacklist would be a sport.
>>
>> Instead, shouldn't we take the reverse approach and whitelist single-byte chars?
>> (warn on multi-byte unicode sequence). It would be sufficient for the vast majority
>> of C sources (and scripts).
>
> I've been going back and forth on that idea. I'm afraid of making a
> change that seems exclusive to people with non-latin characters in
> their name. There are a few pre-canned lists of homoglyphs, maybe I
> could add those to the blacklist?
>

I understand, but the check should only execute on {.c,.h,.in} files, not on the
commit header itself.

If the check is restricted to source files, I don't think any name would
appear: comments and documentation are written in English.
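
As a rough sketch of what I mean by the whitelist approach, assuming the check
is handed a file path and its lines: the helper names below are made up for
illustration and are not existing checkpatch functions, and the extension list
is just the one mentioned in this mail.

```python
# Hypothetical sketch; none of these names exist in checkpatch today.

# Assumption: only these file types are checked, as suggested in this mail.
SOURCE_EXTENSIONS = (".c", ".h", ".in")


def is_checked_file(path):
    """True if the path ends with one of the whitelisted extensions."""
    return path.endswith(SOURCE_EXTENSIONS)


def non_ascii_columns(line):
    """Columns of characters that need more than one byte in UTF-8."""
    return [col for col, ch in enumerate(line) if ord(ch) > 0x7F]


def check_file_lines(path, lines):
    """Return (line number, column) pairs to warn about.

    Files outside SOURCE_EXTENSIONS are skipped, so commit headers and
    author lists never reach the check.
    """
    if not is_checked_file(path):
        return []
    return [(lineno, col)
            for lineno, line in enumerate(lines, 1)
            for col in non_ascii_columns(line)]
```

With this, a name with an accent in a commit header or an authors file is
never inspected, while the same bytes in a .c file would produce a warning.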

-- 
Gaetan Rivet

