How to process a recorded audio file in a Flask backend?

Asked by: Jayanta Basumatary  Asked: 11/14/2023  Updated: 11/14/2023  Views: 13

Q:

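I have developed a Flask application in which I record the user's voice with JavaScript and want to process the audio in the Flask backend. I am using OpenAI Whisper to transcribe the recorded audio. This works, but the problem is that I have to save the recorded audio to disk first and only then process it. I need a different approach that lets me transcribe the audio directly, without saving it locally. Here is the code: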


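    // Recorder state shared between calls to startRecording()
    let mediaRecorder;
    let audioChunks = [];
    let isRecording = false;

    function startRecording() {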
      if (!isRecording) {
        navigator.mediaDevices
          .getUserMedia({ audio: true })
          .then(function (stream) {
            mediaRecorder = new MediaRecorder(stream);
            mediaRecorder.ondataavailable = function (event) {
              audioChunks.push(event.data);
            };
    
            mediaRecorder.onstop = function () {
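              // MediaRecorder does not produce WAV; label the blob with the
              // container the browser actually recorded (typically WebM or Ogg)
              const audioBlob = new Blob(audioChunks, { type: mediaRecorder.mimeType });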
              // const audioUrl = URL.createObjectURL(audioBlob);
              audioChunks = [];
              // Play the recorded audio
              // const audioElement = document.getElementById("audio");
              // audioElement.src = audioUrl;
              // audioElement.play();
              sendAudioData(audioBlob, "translation1", "translation2");
              isRecording = false;
            };
    
            mediaRecorder.start();
            isRecording = true;
          })
          .catch(function (error) {
            console.error("Error accessing microphone:", error);
          });
      } else {
        mediaRecorder.stop();
      }
    }
    
    function sendAudioData(
      audioBlob,
      translationTextareaId,
      transcription2TextareaId
    ) {
      showLoadingAnimation();
      const formData = new FormData();
      formData.append("audio", audioBlob);
    
      fetch("/translate", {
        method: "POST",
        body: formData,
      })
        .then((response) => response.json())
        .then((data) => {
          // Handle the JSON response from the Flask route
          const translation_in_english = data.translation1;
          const translation_in_assamese = data.translation2;
          // document.getElementById(translationTextareaId).textContent =
          //   translation_in_english;
          const translationTextarea = document.getElementById(
            translationTextareaId
          );
          hideLoadingAnimation();
          animateText(translationTextarea, translation_in_english);
          document.getElementById(transcription2TextareaId).textContent =
            translation_in_assamese;
        })
        .catch((error) => {
          // Make sure the loading animation is removed if the request fails
          hideLoadingAnimation();
          console.error("Error sending audio:", error);
        });
    }

Here is the route:


    import os
    import uuid

    import whisper
    from flask import render_template, request, jsonify, Blueprint
    from app import app
    # _download(_MODELS["large-v2"], "/mnt/d/whisper-backend/models", False)
    
    UPLOADED_FOLDER = '/mnt/d/whisper-backend/recordings'
    MODEL_PATH_LARGE = '/mnt/d/whisper-backend/models/large-v2.pt'
    MODEL_PATH_TINY = '/mnt/d/whisper-backend/models/tiny.pt'
    
    # Determine the directory of your Flask app script
    current_directory = os.path.dirname(__file__)
    
    model = whisper.load_model('./models/tiny.pt')
    
    
    translate_route_blueprint = Blueprint("translate", __name__)
    
    @translate_route_blueprint.route('/translate', methods=['POST'])
    def translate():
        audio_file = request.files['audio']
        # Generate a unique filename using a UUID
        unique_filename = str(uuid.uuid4()) + '.wav'
        filepath = os.path.join(UPLOADED_FOLDER, unique_filename)
        audio_file.save(filepath)
        transcription = ""
        translation_1 = ""
        result = model.transcribe(filepath)
        transcription = result["text"]
        return jsonify({'translation1': transcription, 'translation2': translation_1})

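I first have to save the recorded file under a unique name, `unique_filename = str(uuid.uuid4()) + '.wav'`, and only then pass it to the transcription call. Please suggest a different approach that does this without saving the file.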
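One possible direction, offered as a sketch rather than a tested answer: `model.transcribe` accepts a NumPy array of 16 kHz mono float32 samples as well as a file path, so the uploaded bytes could be decoded in memory by piping them through `ffmpeg` instead of writing a `.wav` file. The helper names below (`ffmpeg_decode_command`, `decode_to_pcm16`, `transcribe_bytes`) are hypothetical, and the code assumes `ffmpeg` is on the `PATH` (Whisper already needs it to load files) and that `numpy` is installed.

```python
import subprocess
from typing import List

SAMPLE_RATE = 16000  # Whisper models work on 16 kHz mono audio


def ffmpeg_decode_command(sample_rate: int = SAMPLE_RATE) -> List[str]:
    """Build an ffmpeg command that reads audio from stdin and writes
    raw 16-bit little-endian mono PCM to stdout."""
    return [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",            # read the uploaded bytes from stdin
        "-f", "s16le",             # raw PCM output, no container
        "-acodec", "pcm_s16le",
        "-ac", "1",                # downmix to mono
        "-ar", str(sample_rate),   # resample to the target rate
        "pipe:1",                  # write the result to stdout
    ]


def decode_to_pcm16(data: bytes, sample_rate: int = SAMPLE_RATE) -> bytes:
    """Decode an in-memory recording (e.g. WebM or Ogg) to raw PCM via ffmpeg."""
    proc = subprocess.run(
        ffmpeg_decode_command(sample_rate),
        input=data, capture_output=True, check=True,
    )
    return proc.stdout


def transcribe_bytes(model, data: bytes) -> str:
    """Transcribe uploaded audio bytes without writing them to disk."""
    import numpy as np  # imported here so the helpers above need only the stdlib

    pcm = decode_to_pcm16(data)
    # Scale int16 samples to float32 in [-1, 1), the format Whisper expects
    audio = np.frombuffer(pcm, dtype=np.int16).astype(np.float32) / 32768.0
    return model.transcribe(audio)["text"]
```

In the route, `audio_file.save(filepath)` and `model.transcribe(filepath)` would then become something like `transcription = transcribe_bytes(model, request.files['audio'].read())`. Some containers (for example MP4 files whose index sits at the end) need seekable input and may not decode from a pipe, in which case a temporary file is still the simpler route.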

javascript flask python-requests audio-streaming openai-whisper

A: No answers yet