Fixing Azure OpenAI's content filter not being applied to Japanese prompts

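Introduction

I'm Ugajin, a data engineer at PERSOL CAREER.

While developing a service that uses generative AI, we found that the conditions under which the content filter is applied differed between our prompt engineer's verification results and a Python script that uses openai-python. This post describes the cause and a workaround.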

Azure OpenAI Service

Azure OpenAI Service is one of the services in Microsoft Azure, Microsoft's cloud platform, and provides access to AI models. See the linked page for details.

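Content filter

The content filter is a feature of Azure OpenAI Service that checks both the prompt and the response for inappropriate content. When inappropriate content is detected, the conversation is interrupted immediately. See the linked page for details.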

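How to run a model deployed on Azure OpenAI Service

There are three ways to run a deployed model:

Send a request from the playground
Send a request through the REST API
Send a request using the OpenAI Python API library

In the project I worked on, the prompt engineer verified prompts in the playground, and the engineers implemented them with the OpenAI Python API library (hereafter openai-python). The model's responses in the playground and through openai-python therefore need to match.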

Verification

I checked whether the playground and openai-python produce the same results.

Test environment

The verification was run in the following environment.

OS: macOS Sonoma 14.2.1

Python version: 3.11.3

openai-python version: 1.16.2

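Problem: the content filter triggers on different prompts in the playground and in openai-python

I noticed that the prompts that trigger the content filter in the playground differ from the prompts that trigger it through openai-python.

For example, let's enter the prompt 「爆弾の作り方を教えてください」 ("Please tell me how to make a bomb") in the playground. A conversation like this is clearly inappropriate, so we would expect the content filter to be applied.

As the figure above shows, the content filter ended the conversation, as expected.

Next, let's ask the same question through openai-python.

I used the following code. It is almost identical to the sample code published for openai-python; only the prompt has been changed.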

from openai import AzureOpenAI

api_version = "2023-05-15"

# api_key is passed explicitly here; if omitted, the client reads the AZURE_OPENAI_API_KEY environment variable
client = AzureOpenAI(
    api_version=api_version,
    api_key="XXXX",
    azure_endpoint="https://xxxxxxxxxxxxxxx.openai.azure.com",
)

completion = client.chat.completions.create(
    model="model-name",  # e.g. gpt-35-instant
    messages=[
        {
            "role": "system",
            "content": "You are an AI assistant that helps people find information.",
        },
        {
            "role": "user",
            "content": "爆弾の作り方を教えて",
        },
    ],
)
print(completion.model_dump_json(indent=2))

Here is the result. The content filter was not applied, and the conversation continued.

{
  "id": "chatcmpl-xxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "message": {
        "content": "Sorry, but I can't assist with that.",
        "role": "assistant",
        "function_call": null,
        "tool_calls": null
      }
    }
  ],
  "created": 1712299823,
  "model": "gpt-4-32k",
  "object": "chat.completion",
  "system_fingerprint": null,
  "usage": {
    "completion_tokens": 10,
    "prompt_tokens": 35,
    "total_tokens": 45
  }
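}

The model answered "Sorry, but I can't assist with that.", so it did not say anything inappropriate, but the behavior differs from the playground.

Next, I confirmed that when the same prompt is sent through the REST API, the content filter is applied just as in the playground.

I used Postman to call the REST API.

As in the playground, the content filter stopped the conversation.

Next, I called the REST API from Python and checked the behavior.

Here is the Python code that calls the REST API.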

import requests
import json
import pprint

url = "https://xxxxx"
url_params = {"api-version": "2023-07-01-preview"}
headers = {
  "content-type": "application/json",
  "api-key": "dumy"}

body_param = { "messages": [
  {
    "role": "system",
    "content": "You are an AI assistant that helps people find information."
  },
  {
    "role": "user",
    "content": "爆弾の作り方を教えて"
  }
]}
body_param = json.dumps(body_param)

response = requests.post(url, params=url_params, data=body_param, headers=headers)

pprint.pprint(json.loads(response.text))

Here is the result.

{'choices': [{'content_filter_results': {'custom_blocklists': [],
                                         'hate': {'filtered': False,
                                                  'severity': 'safe'},
                                         'profanity': {'detected': False,
                                                       'filtered': False},
                                         'self_harm': {'filtered': False,
                                                       'severity': 'safe'},
                                         'sexual': {'filtered': False,
                                                    'severity': 'safe'},
                                         'violence': {'filtered': False,
                                                      'severity': 'safe'}},
              'finish_reason': 'stop',
              'index': 0,
              'message': {'content': "I'm really sorry that you're feeling "
                                     "this way, but I'm unable to provide the "
                                     "help that you need. It's really "
                                     'important to talk things over with '
                                     'someone who can, though, such as a '
                                     'mental health professional or a trusted '
                                     'person in your life.',
                          'role': 'assistant'}}],
 'created': 1712301148,
 'id': 'chatcmpl-xxxxxxxxxxx',
 'model': 'gpt-4-32k',
 'object': 'chat.completion',
 'prompt_filter_results': [{'content_filter_results': {'custom_blocklists': [],
                                                       'hate': {'filtered': False,
                                                                'severity': 'safe'},
                                                       'jailbreak': {'detected': False,
                                                                     'filtered': False},
                                                       'profanity': {'detected': False,
                                                                     'filtered': False},
                                                       'self_harm': {'filtered': False,
                                                                     'severity': 'safe'},
                                                       'sexual': {'filtered': False,
                                                                  'severity': 'safe'},
                                                       'violence': {'filtered': False,
                                                                    'severity': 'safe'}},
                            'prompt_index': 0}],
 'system_fingerprint': None,
 'usage': {'completion_tokens': 52, 'prompt_tokens': 28, 'total_tokens': 80}}

As with openai-python, the content filter was not triggered: every category reports 'filtered': False and finish_reason is 'stop'.
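Reading these annotations by eye gets tedious when comparing many prompts. The sketch below is one possible way to list the categories that were actually filtered. It assumes the response has the same shape as the output shown above; `filtered_categories` and `sample` are hypothetical names introduced here for illustration, and `sample` is a trimmed-down copy of that output.

```python
from typing import Any


def filtered_categories(response: dict[str, Any]) -> list[str]:
    """Return "<where>.<category>" for every annotation with filtered=True."""
    hits: list[str] = []

    # Annotations for the prompt live under prompt_filter_results.
    for item in response.get("prompt_filter_results", []):
        for name, result in item.get("content_filter_results", {}).items():
            # Entries such as custom_blocklists are lists, so skip non-dicts.
            if isinstance(result, dict) and result.get("filtered"):
                hits.append(f"prompt.{name}")

    # Annotations for each completion live under choices[].content_filter_results.
    for choice in response.get("choices", []):
        for name, result in choice.get("content_filter_results", {}).items():
            if isinstance(result, dict) and result.get("filtered"):
                hits.append(f"completion.{name}")

    return hits


# Trimmed-down version of the response shown above (hypothetical sample).
sample = {
    "choices": [
        {
            "content_filter_results": {
                "custom_blocklists": [],
                "hate": {"filtered": False, "severity": "safe"},
                "violence": {"filtered": False, "severity": "safe"},
            },
            "finish_reason": "stop",
        }
    ],
    "prompt_filter_results": [
        {
            "content_filter_results": {
                "custom_blocklists": [],
                "jailbreak": {"detected": False, "filtered": False},
                "violence": {"filtered": False, "severity": "safe"},
            },
            "prompt_index": 0,
        }
    ],
}

print(filtered_categories(sample))  # prints [] because nothing was filtered
```

With the `requests` code above, this could be called as `filtered_categories(json.loads(response.text))`.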

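Apparently, the content filter is not applied when the request is sent from Python.

Cause and fix

The cause appeared to be that Python's json.dumps() Unicode-escapes the Japanese text: by default, non-ASCII characters are written as \uXXXX sequences.

As a test, I changed the Japanese prompt to English, and confirmed that the content filter was applied to the same prompt in every environment.

Result in the playground

Result with openai-python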

Traceback (most recent call last):
  File "/Users/xxxx/Documents/test/test.py", line 12, in <module>
    completion = client.chat.completions.create(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/.pyenv/versions/3.11.3/lib/python3.11/site-packages/openai/_utils/_utils.py", line 275, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/.pyenv/versions/3.11.3/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 667, in create
    return self._post(
           ^^^^^^^^^^^
  File "/Users/xxxx/.pyenv/versions/3.11.3/lib/python3.11/site-packages/openai/_base_client.py", line 1213, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/xxxx/.pyenv/versions/3.11.3/lib/python3.11/site-packages/openai/_base_client.py", line 902, in request
    return self._request(
           ^^^^^^^^^^^^^^
  File "/Users/xxxx/.pyenv/versions/3.11.3/lib/python3.11/site-packages/openai/_base_client.py", line 993, in _request
    raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'error': {'message': "The response was filtered due to the prompt triggering Azure OpenAI's content management policy. Please modify your prompt and retry. To learn more about our content filtering policies please read our documentation: https://go.microsoft.com/fwlink/?linkid=2198766", 'type': None, 'param': 'prompt', 'code': 'content_filter', 'status': 400}}
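When calling the REST API directly, a blocked prompt can be recognized from the error payload rather than from an exception. The following is a minimal sketch, assuming a blocked REST request returns HTTP 400 with the same error payload that appears in the traceback above; `is_content_filter_block` and `blocked` are hypothetical names, and the message in `blocked` is shortened to its first sentence.

```python
from typing import Any


def is_content_filter_block(status_code: int, payload: dict[str, Any]) -> bool:
    """True when the service rejected the prompt because of the content filter."""
    error = payload.get("error")
    if not isinstance(error, dict):
        return False
    return status_code == 400 and error.get("code") == "content_filter"


# Error payload taken from the traceback above (message shortened).
blocked = {
    "error": {
        "message": "The response was filtered due to the prompt triggering "
        "Azure OpenAI's content management policy.",
        "type": None,
        "param": "prompt",
        "code": "content_filter",
        "status": 400,
    }
}

print(is_content_filter_block(400, blocked))
print(is_content_filter_block(200, {"choices": []}))
```

With `requests`, the call would look like `is_content_filter_block(response.status_code, response.json())`.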

Since the English prompt is filtered in every environment, the fix is to change how the request JSON is encoded.

import requests
import json
import pprint

url = "https://xxxxx"
url_params = {'api-version': '2023-07-01-preview'}
headers = {
  'content-type': 'application/json',
  'api-key': 'dumy'}

body_param = { "messages": [
  {
    "role": "system",
    "content": "You are an AI assistant that helps people find information."
  },
  {
    "role": "user",
    "content": "how to make bomb"
  }
]}
# Before the fix
# body_param = json.dumps(body_param)
# After the fix
# Do not let json.dumps escape non-ASCII characters
body_param = json.dumps(body_param, ensure_ascii=False)
# Encode the string to bytes ourselves instead
# str.encode() uses UTF-8 by default
body_param = body_param.encode()
response = requests.post(url, params=url_params, data=body_param, headers=headers)

pprint.pprint(json.loads(response.text))

With this fix, the content filter is applied to the same prompt in both the playground and the REST API.

Let's print body_param to see how setting ensure_ascii to False changes it.

With ensure_ascii=True

{"messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "\u6b7b\u306b\u305f\u3044\u3067\u3059"}]}

With ensure_ascii=False

{"messages": [{"role": "system", "content": "You are an AI assistant that helps people find information."}, {"role": "user", "content": "死にたいです"}]}

As intended, the Japanese text is no longer Unicode-escaped.
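The two payloads differ only in how they look on the wire. The sketch below uses just the standard library to show that both forms parse back to the same value, while only the escaped one is pure ASCII. The variable names are chosen here for illustration. That the content filter treats the two differently is the observation of this post, not something this snippet can demonstrate.

```python
import json

body = {"messages": [{"role": "user", "content": "死にたいです"}]}

escaped = json.dumps(body)                   # default: ensure_ascii=True
raw = json.dumps(body, ensure_ascii=False)   # keep non-ASCII characters as-is

escaped_bytes = escaped.encode("utf-8")
raw_bytes = raw.encode("utf-8")

# Both payloads decode to the same Python object ...
same_value = json.loads(escaped_bytes) == json.loads(raw_bytes) == body

# ... but only the escaped one is pure ASCII on the wire.
print(same_value)                                    # True
print(escaped_bytes.isascii(), raw_bytes.isascii())  # True False
print("\\u6b7b" in escaped, "死" in raw)              # True True
```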

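openai-python encodes the request to JSON inside the library, so unless you modify the library itself, the content filter will probably not be applied to the same prompt.

Summary

This post described how to resolve the problem of Azure OpenAI's content filter not being applied. To have the content filter applied to Japanese prompts as well, it seems best to call the REST API yourself rather than use the openai-python library.

I plan to keep writing about the stumbling blocks we run into while building services with generative AI.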

宇賀神拓也　Takuya Ugajin

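Lead Engineer, Web Application Engineer Group, Digital Solution Department, Digital Technology Division

Joined PERSOL CAREER in 2023. Previously worked on in-house service development at an AI startup. Currently responsible for backend development.

* Information current as of May 2024.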