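Mastra Input Processors

Input processors let you intercept, modify, validate, or filter messages before they are sent to the language model. They are useful for implementing guardrails, content moderation, message transformation, and security controls.

Processors operate on the messages in the conversation thread. They can modify, filter, or validate content, and can abort the request entirely when certain conditions are met.

Built-in Processors

Mastra provides several built-in processors for common use cases.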

`UnicodeNormalizer`

This processor normalizes Unicode text to ensure a consistent format and removes potentially problematic characters.

import { Agent } from "@mastra/core/agent";
import { UnicodeNormalizer } from "@mastra/core/agent/input-processor/processors";
import { openai } from "@ai-sdk/openai";
 
const agent = new Agent({
  name: 'normalized-agent',
  instructions: 'You are a helpful assistant',
  model: openai("gpt-4o"),
  inputProcessors: [
    new UnicodeNormalizer({
      stripControlChars: true,
      collapseWhitespace: true,
    }),
  ],
});

Available options:

stripControlChars: Remove control characters (default: false)
preserveEmojis: Keep emojis (default: true)
collapseWhitespace: Collapse multiple spaces/newlines (default: true)
trim: Remove leading and trailing whitespace (default: true)
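
To make the options above concrete, here is a minimal, self-contained sketch of what this kind of normalization might look like. It is not Mastra's implementation: `normalizeText` and `NormalizeOptions` are hypothetical names, the choice of NFKC is an assumption, and `preserveEmojis` is left out for brevity.

```typescript
// Hypothetical options that mirror the names listed above (illustration only).
interface NormalizeOptions {
  stripControlChars?: boolean;
  collapseWhitespace?: boolean;
  trim?: boolean;
}

function normalizeText(text: string, options: NormalizeOptions = {}): string {
  const { stripControlChars = false, collapseWhitespace = true, trim = true } = options;

  // NFKC folds compatibility forms (for example full-width letters) into canonical ones.
  let result = text.normalize("NFKC");

  if (stripControlChars) {
    // Remove C0/C1 control characters, keeping tab, newline, and carriage return.
    result = result.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F-\u009F]/g, "");
  }
  if (collapseWhitespace) {
    // Collapse runs of spaces/tabs to one space and runs of 3+ newlines to two.
    result = result.replace(/[ \t]+/g, " ").replace(/\n{3,}/g, "\n\n");
  }
  return trim ? result.trim() : result;
}

// Full-width "Ｈｅｌｌｏ" plus a NUL character and extra spaces:
console.log(normalizeText("  Ｈｅｌｌｏ\u0000   world  ", { stripControlChars: true })); // "Hello world"
```

Running normalization first means later processors see the same text regardless of how the user's client encoded it.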

`ModerationInputProcessor`

This processor provides content moderation, using an LLM to detect inappropriate content across multiple categories.

import { ModerationInputProcessor } from "@mastra/core/agent/input-processor/processors";
 
const agent = new Agent({
  inputProcessors: [
    new ModerationInputProcessor({
      model: openai("gpt-4.1-nano"), // Use a fast, cost-effective model
      threshold: 0.7, // Confidence threshold for flagging
      strategy: 'block', // Block flagged content
      categories: ['hate', 'harassment', 'violence'], // Custom categories
    }),
  ],
});

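Available options:

model: Language model for moderation analysis (required)
categories: Array of categories to check (default: ['hate','hate/threatening','harassment','harassment/threatening','self-harm','self-harm/intent','self-harm/instructions','sexual','sexual/minors','violence','violence/graphic'])
threshold: Confidence threshold for flagging (0-1, default: 0.5)
strategy: Action to take when content is flagged (default: 'block')
customInstructions: Custom instructions for the moderation agent

Available strategies:

block: Reject the request with an error (default)
warn: Log a warning but allow the content through
filter: Remove flagged messages but continue processing
rewrite: Attempt to neutralize the content while preserving its intent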
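
The interaction between `threshold` and `strategy` can be sketched as a small decision function. This is an illustration under stated assumptions, not Mastra's code: `applyStrategy`, `Verdict`, and `Outcome` are hypothetical, and it assumes content counts as flagged when the score is greater than or equal to the threshold.

```typescript
type Strategy = "block" | "warn" | "filter" | "rewrite";

// Hypothetical classifier result (illustration only).
interface Verdict {
  score: number;      // confidence in the range 0-1
  rewritten?: string; // neutralized text, if the classifier produced one
}

type Outcome =
  | { action: "pass"; text: string }
  | { action: "drop" }
  | { action: "abort"; reason: string };

function applyStrategy(
  text: string,
  verdict: Verdict,
  threshold: number,
  strategy: Strategy,
): Outcome {
  // Assumption: scores below the threshold are not flagged at all.
  if (verdict.score < threshold) return { action: "pass", text };

  switch (strategy) {
    case "block":
      return { action: "abort", reason: `flagged with score ${verdict.score}` };
    case "warn":
      console.warn(`flagged with score ${verdict.score}, allowing content`);
      return { action: "pass", text };
    case "filter":
      return { action: "drop" };
    case "rewrite":
      // Without a rewritten version, this sketch drops the message rather than pass it unchanged.
      return verdict.rewritten !== undefined
        ? { action: "pass", text: verdict.rewritten }
        : { action: "drop" };
  }
}
```

Raising the threshold makes every strategy trigger less often, trading missed detections for fewer false positives.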

`PromptInjectionDetector`

This processor detects and prevents prompt injection attacks, jailbreaks, and system manipulation attempts.

import { PromptInjectionDetector } from "@mastra/core/agent/input-processor/processors";
 
const agent = new Agent({
  inputProcessors: [
    new PromptInjectionDetector({
      model: openai("gpt-4.1-nano"),
      threshold: 0.8, // Higher threshold to reduce false positives
      strategy: 'rewrite', // Attempt to neutralize while preserving intent
      detectionTypes: ['injection', 'jailbreak', 'system-override'],
    }),
  ],
});

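Available options:

model: Language model for injection detection (required)
detectionTypes: Array of injection types to detect (default: ['injection', 'jailbreak', 'system-override'])
threshold: Confidence threshold for flagging (0-1, default: 0.7)
strategy: Action to take when an injection is detected (default: 'block')
instructions: Custom detection instructions for the agent
includeScores: Whether to include confidence scores in logs (default: false)

Available strategies:

block: Reject the request (default)
warn: Log a warning but allow the content through
filter: Remove flagged messages
rewrite: Attempt to neutralize the injection while preserving legitimate intent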

`PIIDetector`

This processor detects and optionally redacts personally identifiable information (PII) in messages.

import { PIIDetector } from "@mastra/core/agent/input-processor/processors";
 
const agent = new Agent({
  inputProcessors: [
    new PIIDetector({
      model: openai("gpt-4.1-nano"),
      threshold: 0.6,
      strategy: 'redact', // Automatically redact detected PII
      detectionTypes: ['email', 'phone', 'credit-card', 'ssn', 'api-key', 'crypto-wallet', 'iban'],
      redactionMethod: 'mask', // Mask values while preserving format
      preserveFormat: true, // Keep the original structure of redacted values
      includeDetections: true, // Log details for compliance auditing
    }),
  ],
});

Available options:

model: Language model for PII detection (required)
detectionTypes: Array of PII types to detect (default: ['email', 'phone', 'credit-card', 'ssn', 'api-key', 'ip-address', 'name', 'address', 'date-of-birth', 'url', 'uuid', 'crypto-wallet', 'iban'])
threshold: Confidence threshold for flagging (0-1, default: 0.6)
strategy: Action to take when PII is detected (default: 'block')
redactionMethod: How to redact PII ('mask', 'hash', 'remove', 'placeholder'; default: 'mask')
preserveFormat: Keep the structure of the PII during redaction (default: true)
includeDetections: Include detection details in logs for compliance (default: false)
instructions: Custom detection instructions for the agent

Available strategies:

block: Reject requests that contain PII (default)
warn: Log a warning but allow the content through
filter: Remove messages that contain PII
redact: Replace PII with placeholder values
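
To illustrate how the redaction methods differ, the sketch below redacts email addresses with a plain regular expression. This is only an approximation for illustration: the real processor uses a language model for detection, `redactEmails` and `maskPreservingFormat` are hypothetical helpers, the email pattern is deliberately simple, and the 'hash' method is omitted.

```typescript
type RedactionMethod = "mask" | "remove" | "placeholder";

// Deliberately simple email pattern, for illustration only.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

// Replace letters and digits with '*' but keep separators such as '@' and '.',
// so the shape of the original value stays recognizable.
function maskPreservingFormat(value: string): string {
  return value.replace(/[A-Za-z0-9]/g, "*");
}

function redactEmails(text: string, method: RedactionMethod = "mask"): string {
  return text.replace(EMAIL_PATTERN, (match) => {
    if (method === "remove") return "";
    if (method === "placeholder") return "[EMAIL]";
    return maskPreservingFormat(match);
  });
}

console.log(redactEmails("Contact jane.doe@example.com now"));
// "Contact ****.***@*******.*** now"
console.log(redactEmails("Contact jane.doe@example.com now", "placeholder"));
// "Contact [EMAIL] now"
```

Format-preserving masking keeps enough structure for the model to understand that an email address was present, without exposing the value itself.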

`LanguageDetector`

This processor detects the language of incoming messages and can automatically translate them into a target language.

import { LanguageDetector } from "@mastra/core/agent/input-processor/processors";
 
const agent = new Agent({
  inputProcessors: [
    new LanguageDetector({
      model: openai("gpt-4o-mini"),
      targetLanguages: ['English', 'en'], // Accept English content
      strategy: 'translate', // Automatically translate non-English content
      threshold: 0.8, // High confidence threshold
    }),
  ],
});

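Available options:

model: Language model for detection and translation (required)
targetLanguages: Array of target languages (language names or ISO codes)
threshold: Confidence threshold for language detection (0-1, default: 0.7)
strategy: Action to take when a non-target language is detected (default: 'detect')
preserveOriginal: Keep the original content in metadata (default: true)
instructions: Custom detection instructions for the agent

Available strategies:

detect: Only detect the language, do not translate (default)
translate: Automatically translate into the target language
block: Reject content that is not in a target language
warn: Log a warning but allow the content through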

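Applying Multiple Processors

You can chain multiple processors. They run sequentially in the order they appear in the inputProcessors array, and the output of one processor becomes the input of the next.

Order matters! As a general best practice, place text normalization first, security checks next, and content modification last.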

import { Agent } from "@mastra/core/agent";
import { 
  UnicodeNormalizer, 
  ModerationInputProcessor, 
  PromptInjectionDetector,
  PIIDetector 
} from "@mastra/core/agent/input-processor/processors";
import { openai } from "@ai-sdk/openai";
 
const secureAgent = new Agent({
  inputProcessors: [
    // 1. Normalize text first
    new UnicodeNormalizer({ stripControlChars: true }),
    // 2. Check for security threats
    new PromptInjectionDetector({ model: openai("gpt-4.1-nano") }),
    // 3. Moderate content
    new ModerationInputProcessor({ model: openai("gpt-4.1-nano") }),
    // 4. Handle PII last
    new PIIDetector({ model: openai("gpt-4.1-nano"), strategy: 'redact' }),
  ],
});
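
The sequential behavior described above can be sketched without Mastra at all. The following is a simplified model, not the library's internals: `Message`, `Processor`, `PipelineAbort`, and `runPipeline` are hypothetical stand-ins, with `PipelineAbort` playing the role of the abort error.

```typescript
// Simplified stand-ins; the real Mastra types carry more fields.
type Message = { role: string; text: string };

type Processor = {
  name: string;
  process(args: { messages: Message[]; abort: (reason?: string) => never }): Message[];
};

// Local stand-in for the error raised when a processor aborts.
class PipelineAbort extends Error {
  constructor(reason: string) {
    super(reason);
    this.name = "PipelineAbort";
  }
}

function runPipeline(processors: Processor[], input: Message[]): Message[] {
  const abort = (reason?: string): never => {
    throw new PipelineAbort(reason ?? "aborted");
  };
  // Each processor receives the previous processor's output. A thrown
  // PipelineAbort stops the chain, so later processors never run.
  return processors.reduce(
    (messages, processor) => processor.process({ messages, abort }),
    input,
  );
}
```

Because each stage sees the previous stage's output, a guard that matches on exact text only works reliably when normalization has already run, which is why normalization usually comes first.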

Creating Custom Processors

You can create a custom processor by implementing the InputProcessor interface.

import { TripWire } from "@mastra/core/agent"; // value import: used with instanceof below
import type { InputProcessor, MastraMessageV2 } from "@mastra/core/agent";
 
class MessageLengthLimiter implements InputProcessor {
  readonly name = 'message-length-limiter';
  
  constructor(private maxLength: number = 1000) {}
 
  process({ messages, abort }: { 
    messages: MastraMessageV2[]; 
    abort: (reason?: string) => never 
  }): MastraMessageV2[] {
    // Check the total length of all text parts
    try {
      const totalLength = messages.reduce((sum, msg) => {
        return sum + msg.content.parts
          .filter(part => part.type === 'text')
          .reduce((partSum, part) => partSum + (part as any).text.length, 0);
      }, 0);
      
      if (totalLength > this.maxLength) {
        abort(`Message too long: ${totalLength} characters (max: ${this.maxLength})`); // Throws a TripWire error
      }
    } catch (error) {
      if (error instanceof TripWire) {
        throw error; // Re-throw tripwire errors
      }
      throw new Error(`Length validation failed: ${error instanceof Error ? error.message : 'Unknown error'}`); // Throw a standard error for application-level failures
    }
    
    return messages;
  }
}
 
// Use the custom processor
const agent = new Agent({
  inputProcessors: [
    new MessageLengthLimiter(2000), // Limit to 2000 characters
  ],
});

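When creating custom processors:

Always return the messages array (potentially modified)
Use abort(reason) to terminate processing early. abort is meant to simulate blocking a message. Errors thrown by abort are instances of TripWire. For code or application-level errors, throw a standard error.
Mutate the input messages in place, and make sure to mutate both the parts and the content of a message.
Keep each processor focused on a single responsibility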
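If you use an agent inside a processor, use a fast model, limit the size of its response as much as possible (every additional output token adds latency), and keep the system prompt as concise as possible. All of these are latency bottlenecks.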
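
The guidance on mutating both parts and content in place can be sketched as follows. The message shape here is a simplified assumption rather than the real MastraMessageV2 type: `MessageLike` and `replaceInPlace` are hypothetical, and the sketch assumes the message also carries its text as a plain string under `content.content`.

```typescript
// Simplified, assumed message shape; the real MastraMessageV2 has more fields.
type TextPart = { type: "text"; text: string };
type OtherPart = { type: "file" };
type MessageLike = {
  content: { parts: Array<TextPart | OtherPart>; content?: string };
};

function replaceInPlace(
  messages: MessageLike[],
  search: string,
  replacement: string,
): MessageLike[] {
  for (const message of messages) {
    // Update every text part on the existing objects (no copies are made).
    for (const part of message.content.parts) {
      if (part.type === "text") {
        part.text = part.text.split(search).join(replacement);
      }
    }
    // Keep the plain-string content in sync with the parts.
    if (typeof message.content.content === "string") {
      message.content.content = message.content.content.split(search).join(replacement);
    }
  }
  // Return the same array that was passed in.
  return messages;
}
```

Updating only one of the two representations would leave the message inconsistent, so whichever one the model ends up reading might still contain the original text.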

Integration with Agent Methods

Input processors work with both the generate() and stream() methods. The entire processor pipeline completes before the agent begins generating or streaming a response.

// Processors run before generate()
const result = await agent.generate('Hello');
 
// Processors also run before stream()
const stream = await agent.stream('Hello');
for await (const chunk of stream.textStream) {
  console.log(chunk);
}

If any processor calls abort(), the request terminates immediately and subsequent processors are not run. The agent returns a 200 response with details about why the request was blocked (result.tripwireReason).
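
Because a blocked request still returns a normal response, callers need to check for the block themselves. A minimal sketch, assuming only the `tripwireReason` field mentioned above: `GenerateResultLike` and `describeResult` are hypothetical, and the real result object has more fields.

```typescript
// Assumed minimal result shape (illustration only).
interface GenerateResultLike {
  text: string;
  tripwireReason?: string;
}

function describeResult(result: GenerateResultLike): string {
  // A defined tripwireReason means a processor aborted the request.
  if (result.tripwireReason !== undefined) {
    return `Request blocked: ${result.tripwireReason}`;
  }
  return result.text;
}
```

Checking the field explicitly prevents a blocked request from being treated as a successful but empty reply.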

Translated as of July 26, 2025
by dongne.lab@gmail.com
