☆ Yσɠƚԋσʂ ☆@lemmy.ml to Open Source@lemmy.ml · English · 5 months ago

Microsoft open-sourced a Python tool for converting files and office documents to Markdown

github.com

  • cross-posted to:
  • opensource@programming.dev
GitHub - microsoft/markitdown: Python tool for converting files and office documents to Markdown.
github.com
  • utopiah@lemmy.ml
    5 months ago

    converting audio files to markdown must be a pretty recent feature

    Quite curious… does it actually do that, and if so, how? Speech-to-text (STT) that produces a plaintext file or subtitles (so with timing) has been available and efficient for a while now, e.g. via Whisper. If this does more, though, such as recovering structure (distinguishing a title, a list, etc.), I’d like to learn how.

    • django@discuss.tchncs.de
      5 months ago

      There is nothing special going on. This whole project is just a bunch of Python libraries glued together into a CLI tool. It uses the SpeechRecognition package to connect to the Google speech recognition API: https://github.com/microsoft/markitdown/blob/main/src/markitdown/_markitdown.py#L691

      Pretty uninteresting and a bit disappointing. Pandoc is a lot more interesting.
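The SpeechRecognition call described in this comment can be sketched roughly as follows. This is a minimal illustration of the general pattern, not markitdown's actual code: the function names `google_transcribe` and `audio_to_markdown` and the "Audio Transcript" heading are hypothetical, and the third-party `SpeechRecognition` package is assumed to be installed.

```python
from typing import Callable, Optional


def google_transcribe(path: str) -> str:
    """Transcribe an audio file with SpeechRecognition's Google backend.

    Note: recognize_google uploads the audio to Google's servers.
    """
    # Third-party dependency (pip install SpeechRecognition), imported lazily
    # so the Markdown wrapper below can be used with any other transcriber.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:       # accepts WAV, AIFF, or FLAC
        audio = recognizer.record(source)    # read the whole file into memory
    return recognizer.recognize_google(audio)


def audio_to_markdown(
    path: str, transcribe: Optional[Callable[[str], str]] = None
) -> str:
    """Wrap a transcript in a small Markdown section (layout is illustrative)."""
    transcribe = transcribe or google_transcribe
    text = transcribe(path).strip()
    body = text if text else "[No speech detected]"
    return f"### Audio Transcript\n\n{body}\n"
```

The result is a plain transcript under one heading, with no recovered structure such as titles or lists. Passing a different `transcribe` callable swaps the backend without touching the Markdown step.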

      • utopiah@lemmy.ml
        5 months ago

        Thanks for the clarification. I checked the code you linked and noticed recognize_google. It seems to rely on https://github.com/Uberi/speech_recognition and, within it, on https://github.com/Uberi/speech_recognition/blob/master/speech_recognition/recognizers/google.py so are they basically using an API and sending all the audio data to Google’s servers?

        • django@discuss.tchncs.de
          5 months ago

          Yes, this is how I read it as well. The library supports using a local model, but they decided to just send the audio data to Google.
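For comparison, a local-model variant might look like the sketch below. It assumes the SpeechRecognition package's `recognize_whisper` and `recognize_sphinx` methods, which need the extra `openai-whisper` or `pocketsphinx` packages respectively. The helper names `recognize_locally` and `transcribe_file_locally` are hypothetical, and Whisper may still download model weights the first time it runs.

```python
# Maps a backend name to the Recognizer method that runs on the local machine.
LOCAL_BACKENDS = {
    "whisper": "recognize_whisper",  # requires the openai-whisper package
    "sphinx": "recognize_sphinx",    # requires the pocketsphinx package
}


def recognize_locally(recognizer, audio, backend: str = "whisper") -> str:
    """Dispatch to a local recognizer method; refuse anything else."""
    if backend not in LOCAL_BACKENDS:
        raise ValueError(f"unknown local backend: {backend!r}")
    return getattr(recognizer, LOCAL_BACKENDS[backend])(audio)


def transcribe_file_locally(path: str, backend: str = "whisper") -> str:
    """Load an audio file and transcribe it with a local backend."""
    # Third-party dependency (pip install SpeechRecognition), imported lazily.
    import speech_recognition as sr

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)
    return recognize_locally(recognizer, audio, backend)
```

Restricting the choice to an explicit allow-list means a caller cannot end up on a network-backed recognizer by accident.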

          • utopiah@lemmy.ml
            5 months ago

            Might open up a GDPR-related issue there. I don’t think people using such a library assume that they need connectivity or that their data will be sent to a third party.
