Questions tagged [text-processing]

Mechanizing the creation or manipulation of electronic text.

0 votes
0 answers
21 views

DeepL API fails to translate from Turkish to Ukrainian/Russian

I am developing an audio translation application using Python and the DeepL API. The program successfully recognizes Turkish speech and transcribes it, but fails to translate the text to Ukrainian or ...
Illia • 1
-2 votes
2 answers
101 views

Keep most common line from each set of duplicates of a column

I have a relatively complicated Bash problem. I have a two-column CSV file that contains duplicate values in the first column, as well as duplicates within those duplicate values (in the second column)...
Hashim Aziz • 5,138
-2 votes
0 answers
27 views

How do I improve this Bash function to accept paths with spaces in them? [duplicate]

I wrote this Bash function that returns the full path of the directory or file passed as the first argument. If no argument is given, it returns the PWD. However, it does not work when the file ...
K.defaoite
3 votes
1 answer
173 views

How to retrieve text from the current line at specified cursor position before and after up to specified boundary characters?

Examples of boundary characters can be: "", '', (), space, ^$ (start and end of the line if any other boundary characters are not specified explicitly). Boundary characters should be easily ...
Anton Samokat
0 votes
2 answers
48 views

Convert a list of words in GitHub Actions into a JSON array and use it as a strategy matrix

In GitHub Actions, I have a space-separated list of words in Bash, e.g. hello1 hello2 hello3. The goal is to convert this list to a JSON array, write it to the output variable and use it ...
0 votes
1 answer
170 views

Combine text embeddings

What's the best way to combine text embeddings into one and then search in a vector DB? I'm trying to create a recommendation system, so when a user clicks on another category, I get the text embedding ...
Bill • 15
1 vote
1 answer
58 views

A regex line to remove whitespace unless within double quotes, taking into account escaped double quotes

I am parsing some game config files using Python and putting everything into dictionaries. It all seemed to work well until I encountered the following edge case: random_owned_controlled_state = { ...
lrdewaal • 180
0 votes
0 answers
16 views

How can I process raw text format job posts from LinkedIn or similar sites into a key-value format?

I've collected some job posts from LinkedIn for research purposes. I would like to extract specific data from these job posts and save it in my SQL database. How can I process the job post .txt files and ...
Towhid.kahn
1 vote
0 answers
68 views

How to Automatically Split Overlong Lines in FORTRAN 77 Files Using Python?

Following this question, I'm working with legacy FORTRAN 77 files and need to address an issue where some lines exceed the 72-character limit. I've already developed a Python script that identifies ...
Foad S. Farimani
0 votes
0 answers
152 views

Creating a Python script to mask sensitive data in a flat log file

I want to mask all types of sensitive data (usernames, passwords, API keys, DB connection strings, endpoints, secrets, and even any custom variables containing secrets) present in a flat log file. ...
Navdeep Singh
0 votes
1 answer
183 views

Why is my Python script not calling GPT-3.5-turbo API?

Situation: My Python script compiles and runs successfully. It creates the output file (edited.txt), but doesn't write anything to the file. The API dashboard shows no usage, so I'm guessing the script ...
Tre Scinta
0 votes
1 answer
83 views

Return sentences from a list of sentences using a user-specified keyword

I have a list of sentences (roughly 20,000) stored in an Excel file named list.xlsx, in a sheet named Sentence, under a column named Sentence. My intention is to get words from the user and return those ...
Programmer_nltk
1 vote
2 answers
52 views

Determining the type of error and its place in the text

I have an Excel file with thousands of rows containing information about the fulfilment of a contract. The data is loaded into the system using a template. But sometimes the template is filled out ...
Oleg Romanov
0 votes
1 answer
55 views

Implementing Automatic Syllable Splitting and Coloring in a Flutter App

I'm currently working on an educational app in Flutter aimed at making reading easier for children. A key feature I want to add is automatic syllable splitting of texts. The idea is to have the app ...
Carolin Heimsoth
1 vote
1 answer
185 views

How to use jq to print one line per root-level object key?

I would like to reduce the size of a JSON file by printing in compact mode (-c), but I want a newline after each root-level object. For example, for the object below { "a": { ...
FangQ • 1,520
