Score:11

Errors attempting to run GPT detector under Ubuntu 22.04 WSL


While the OpenAI detector has been useful for identifying content created by ChatGPT and other OpenAI-based models, it has been down more and more frequently as usage increases (especially by users here on Stack Exchange sites).

After installing it locally per the project README, I receive the following error when attempting to run it from the repo directory using python -m detector.server ../gpt-2-models/detector-base.pt:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/home/ntd/src/gpt-2-output-dataset/detector/server.py", line 120, in <module>
    fire.Fire(main)
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/ntd/src/gpt-2-output-dataset/detector/server.py", line 89, in main
    model.load_state_dict(data['model_state_dict'])
  File "/home/ntd/src/venv/openai-detector/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1671, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for RobertaForSequenceClassification:
        Missing key(s) in state_dict: "roberta.embeddings.position_ids".
        Unexpected key(s) in state_dict: "roberta.pooler.dense.weight", "roberta.pooler.dense.bias".

I attempted to change the pinned version to transformers==2.9.1 per comments in this issue, but then pip install -r requirements.txt fails as well.

Score:13

Leaving this answer in place since it still contains useful information, but I've added an easier method in this answer. If you have Docker Desktop, you can also use this answer. Both of the newer answers have the advantage of strictly following the original Hugging Face Space implementation, which this answer lacked.


For me, the primary problem was resolved by using transformers==2.5.1 (as opposed to 2.9.1), but I also needed the Rust compiler (and build-essential) to build it. Most of this, at least starting with step 11, should be applicable to a non-WSL Ubuntu as well. However, there are also a few additional dependencies for CUDA (I can't be entirely sure which, since I don't have a pure-Ubuntu GPU system on which to test).

Here are the complete steps I used to install on Ubuntu 22.04 on WSL. Note that you can simplify them quite a bit by not setting up a separate distribution for the detector, by not setting up a Python venv, or by skipping both. Honestly, doing both is overkill in terms of "isolation", but the steps are all here depending on how you want to handle it:

  1. Registered a new Ubuntu 22.04 WSL distribution with ubuntu2204.exe from PowerShell. None previously existed, for reasons you'll see below.

  2. Added username and password when requested.

  3. Ran the normal, initial sudo apt update && sudo apt upgrade -y.

  4. Set the default username using /etc/wsl.conf per my answer here.

  5. Exited Ubuntu.

  6. wsl --shutdown

  7. Created a directory for my "openai-detector" instance:

    mkdir D:\WSL\instances\openai-detector
    
  8. Copied the just-created Ubuntu 22.04 instance to a new distribution named openai-detector:

    wsl --import --vhd openai-detector D:\WSL\instances\openai-detector\ $env:localappdata\Packages\CanonicalGroupLimited.Ubuntu22.04LTS_79rhkp1fndgsc\LocalState\ext4.vhdx --version 2
    
  9. Removed the ubuntu-22.04 distribution since I can always create another one on demand when needed (as above). However, please only do this if you are sure that this is the one you just created and that there are no files you need from it. This is an irreversible, destructive operation. I'm honestly a bit nervous every time I do it, since there's a chance I'll accidentally use the wrong distribution name. Just ... be careful:

    wsl --unregister ubuntu-22.04
    
  10. Started the new openai-detector distribution created above:

    wsl ~ -d openai-detector
    
  11. Installed rustup and build-essential:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    source "$HOME/.cargo/env"
    sudo apt install build-essential
    
  12. Set up a virtual environment:

    sudo apt install python3-venv
    python3 -m venv ~/src/venv/openai-detector
    source ~/src/venv/openai-detector/bin/activate
    
  13. Cloned the detector and downloaded the model files:

    cd ~/src
    git clone https://github.com/openai/gpt-2-output-dataset.git
    mkdir gpt-2-models
    cd gpt-2-models
    wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-base.pt
    # and/or
    wget https://openaipublic.azureedge.net/gpt-2/detector-models/v1/detector-large.pt
    
  14. Modified the requirements to use Transformers 2.5.1 (or see the sed one-liner after this list):

    editor ~/src/gpt-2-output-dataset/requirements.txt
    

    Change the transformers line to:

    transformers==2.5.1
    
  15. Installed the requirements:

    pip install wheel
    cd ~/src/gpt-2-output-dataset
    pip install -r requirements.txt
    
  16. Ran the server:

    python -m detector.server ../gpt-2-models/detector-base.pt
    
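
If you would rather not open an editor in step 14, the same change can be made non-interactively. Here's a minimal sketch using sed; it assumes the transformers entry in requirements.txt begins with "transformers" (as it does in the repo at the time of writing):

# Pin transformers to 2.5.1 in the requirements file (step 14, non-interactive)
sed -i 's/^transformers.*/transformers==2.5.1/' ~/src/gpt-2-output-dataset/requirements.txt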

After the initial installation, all that should be required to start the detector in the future is:

wsl ~ -d openai-detector
cd ~/src/gpt-2-output-dataset
source ~/src/venv/openai-detector/bin/activate
python -m detector.server ../gpt-2-models/detector-base.pt

A local copy of the OpenAI Detector should now be running on localhost:8080.
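
You can also query the running server from the command line rather than the browser. If I'm reading the repo's detector/server.py correctly, it treats the URL query string as the text to classify and responds with JSON (fields like real_probability and fake_probability; check server.py for the exact format). A minimal sketch:

# URL-encode some sample text and send it as the query string
text="This is some sample text to classify."
curl -s "http://localhost:8080/?$(python3 -c 'import urllib.parse, sys; print(urllib.parse.quote(sys.argv[1]))' "$text")"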

Score:4

And one more solution, in case you can't use my Docker-based answer for some reason. I believe this should replace my original solution as the preferred method.

Using the Dockerfile for the Hugging Face Space as a guide, I've been able to reproduce the detector environment on a fresh Ubuntu 22.04. I'm running it on WSL, but thanks to @cocomac for confirming that this also works on stock Debian. It should run on any recent Debian-based distribution:

sudo apt update && sudo apt upgrade -y

sudo apt install -y \
    git \
    make build-essential libssl-dev zlib1g-dev \
    libbz2-dev libreadline-dev libsqlite3-dev wget curl llvm \
    libncursesw5-dev xz-utils tk-dev libxml2-dev libxmlsec1-dev libffi-dev liblzma-dev git-lfs  \
    ffmpeg libsm6 libxext6 cmake libgl1-mesa-glx

git lfs install
cd ~
git clone https://huggingface.co/spaces/openai/openai-detector
cd openai-detector

PATH=$HOME/.local/bin:$PATH
curl https://pyenv.run | bash
source ~/.bashrc
PATH=$HOME/.pyenv/shims:$HOME/.pyenv/bin:$PATH

pyenv install 3.7.5
# ^ Takes a while

pyenv global 3.7.5
pyenv rehash

python --version
# Confirm 3.7.5

pip install --no-cache-dir --upgrade pip setuptools wheel
pip install --no-cache-dir \
    datasets \
    huggingface-hub "protobuf<4" "click<8.1"

pip install --no-cache-dir -r requirements.txt
# Ignore dependency errors, as it still appears to work

python -m detector.server detector-base.pt --port 7860

For future launches, add the following to something like a webui.sh script:

PATH=$HOME/.pyenv/shims:$HOME/.pyenv/bin:$PATH
pyenv global 3.7.5
pyenv rehash
cd ~/openai-detector
python -m detector.server detector-base.pt --port 7860
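
If it helps, here's a minimal sketch of creating that script (the ~/webui.sh name and location are arbitrary):

# Write the launch script and make it executable
cat > ~/webui.sh <<'EOF'
#!/usr/bin/env bash
# Launch the locally installed OpenAI detector under pyenv's Python 3.7.5
PATH=$HOME/.pyenv/shims:$HOME/.pyenv/bin:$PATH
pyenv global 3.7.5
pyenv rehash
cd ~/openai-detector
python -m detector.server detector-base.pt --port 7860
EOF
chmod +x ~/webui.sh
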
Score:1

Leaving my other answer in place since it's useful for understanding how to set things up "manually", but the recent addition of the Hugging Face Space's "Run with Docker" option makes this far easier.

If you have Docker Desktop installed in Ubuntu (WSL or not), you can start up the OpenAI-Detector in a container:

  • First, make sure you have containerd enabled for Docker Desktop. This is currently a beta feature in Docker Desktop. To enable it, open the Docker Desktop Dashboard and go to Settings -> Features in development -> Beta features, then select "Use containerd for pulling and storing images".

    Without this in place, I received an error when attempting to pull the Docker image from Hugging Face. Note that I have not found a way to enable this in Docker Engine at this point.

  • With that in place, I was able to run the OpenAI-Detector with:

    docker run -it --rm --gpus=all -p 7860:7860 --platform=linux/amd64 registry.hf.space/openai-openai-detector:latest
    

Notes:

  • This image consumes 5.84GB of disk space.
  • The image is, unfortunately, based on Ubuntu 18.04, which will soon reach the end of standard support (EoSS) on May 31.
  • --rm is optional, but recommended so that the container is removed after exiting.
  • --gpus=all is also optional since the image doesn't seem to require GPU support.
  • You can, of course, modify the port number being used by changing the first 7860 to the desired port (e.g. -p 80:7860).
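
For example, combining the last two notes, a CPU-only run mapped to host port 80 (assuming nothing else is bound to port 80):

# Same image, no GPU passthrough, served on http://localhost:80
docker run -it --rm -p 80:7860 --platform=linux/amd64 registry.hf.space/openai-openai-detector:latest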