Questions tagged [text-processing]
Mechanizing the creation or manipulation of electronic text.
text-processing
1,975
questions
0
votes
0
answers
21
views
DeepL API fails to translate from Turkish to Ukrainian/Russian
I am developing an audio translation application using Python and the DeepL API. The program successfully recognizes Turkish speech and transcribes it, but fails to translate the text to Ukrainian or ...
-2
votes
2
answers
101
views
Keep most common line from each set of duplicates of a column
I have a relatively complicated Bash problem. I have a two-column CSV file that contains duplicate values in the first column, as well as duplicates within those duplicate values (in the second column)...
-2
votes
0
answers
27
views
How do I improve this bash function to accept paths with spaces in it? [duplicate]
I wrote this bash function that returns the full path of a given directory or file given in the first argument. If no argument is given, it returns the PWD. However, it does not work when the file ...
3
votes
1
answer
173
views
How to retrieve text from the current line at specified cursor position before and after up to specified boundary characters?
Examples of boundary characters can be: "", '', (), space, ^$ (start and end of the line if any other boundary characters are not specified explicitly). Boundary characters should be easily ...
0
votes
2
answers
48
views
convert list of word in github actions into json array and use as a strategy matrix
I have in github actions a list of words with spaces in the shell Bash. e.g.
hello1 hello2 hello3
The goal is to convert this list to a JSON array format, write it to the output variable and use it ...
0
votes
1
answer
170
views
Combine text embeddings
What's the best way to combine text embeddings into one and then search in vector db?
I'm trying to create a recommendation system, so when a user clicks on another category, I get the text embedding ...
1
vote
1
answer
58
views
A regex line to remove whitespaces unless within double quotes, taking into account escaped double quotes
I am parsing some game config files using Python and putting it all in dictionaries. It seemed to all work well until I encountered the following edge-case:
random_owned_controlled_state = {
...
0
votes
0
answers
16
views
How can I process raw text format job posts from LinkedIn or similar sites into a key-value format?
Ive collected some job posts from linked for research purpose. I would like to get specific datas from these job posts and save inside my sql database. So how can I process the job posts txt files and ...
1
vote
0
answers
68
views
How to Automatically Split Overlong Lines in FORTRAN 77 Files Using Python?
Following this question, I'm working with legacy FORTRAN 77 files and need to address an issue where some lines exceed the 72-character limit. I've already developed a Python script that identifies ...
0
votes
0
answers
152
views
Creating a Python script to mask sensitive data in a flat log file
I want to mask all types of sensitive data (usernames, passwords, api keys, DB connection strings, endpoints, secrets, and even any custom variables containing secrets) present in a flat log file.
...
0
votes
1
answer
183
views
Why is my Python script not calling GPT-3.5-turbo API?
Situation
My Python script compiles and runs successfully. It creates the output file (edited.txt), but doesn't write anything to the file. API dashboard shows no usage, so I'm guessing the script ...
0
votes
1
answer
83
views
Return sentences from list of sentences using user specified keyword
I got a list of sentences (roughly 20000) stored in excel file named list.xlsx and sheet named Sentence under column name named Sentence.
My intention is to get words from user and return those ...
1
vote
2
answers
52
views
Determining the type of error and its place in the text
I have an Excel file with thousands of rows containing information about the fulfilment of a contract. The data is loaded into the system using a template. But sometimes the template is filled out ...
0
votes
1
answer
55
views
Implementing Automatic Syllable Splitting and Coloring in a Flutter App
I'm currently working on an educational app in Flutter aimed at making reading easier for children. A key feature I want to add is automatic syllable splitting of texts. The idea is to have the app ...
1
vote
1
answer
185
views
how to use jq to print one-line per root-level object key?
I would like to compress the space for a json file by printing in compact mode (-c) but I want to add a new line after each root-level object.
for example, for the below object
{
"a": {
...