So... you are not actually looking to distinguish on "emoji or non-Latin characters", but rather on the specifics of how‡ characters are transmitted on the wire?
I cannot think of a way to make Sieve go back to the raw bytes. You could work around this by doing the matching in the mail server instead, e.g. using the Postfix (RFC2047-ignorant) header_checks feature to prepend a custom header:
    # header_checks = pcre:/etc/postfix/maps/remember_header_encoding
    # pcre is case insensitive by default
    /^To:.*=\?utf-8\?B\?/ PREPEND X-Preserve-For-Sieve: RFC2047 marker in header To:
And then check for the existence of such marker headers in Sieve.
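A minimal sketch of the Sieve side, assuming the marker header is named `X-Preserve-For-Sieve` as in the Postfix rule, and that your server supports the standard `fileinto` extension. The folder name `Encoded-To` is made up for illustration; adjust it to your setup.

```sieve
require ["fileinto"];

# The marker header is only present if Postfix saw a raw
# "=?utf-8?B?" encoded-word in the To: header.
# "Encoded-To" is a hypothetical folder name.
if exists "X-Preserve-For-Sieve" {
    fileinto "Encoded-To";
}
```

The `exists` test only looks at whether the header is there, so the text after the colon in the PREPEND action is free-form and just serves as a note to yourself.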
Even if this works today, I doubt it will be a reliable sorting criterion for the foreseeable future. Any relaying SMTP server, up to and including the one handing the message to Sieve, might add encoding where there previously was none as part of message transformations. Some mail clients will add encoding where none is needed, others will fail to do so even though they should. Detecting a difference where none was intended is probably not going to consistently affect the same sorts of messages.
‡ With regular mail there is rarely any choice beyond whether to add superfluous encoding - Dovecot does not yet guarantee 8-bit-clean transports such as SMTPUTF8.