
Sunday, June 2, 2024

Find Special/Escape Characters in Sqlite

Find TAB

SELECT * FROM table WHERE column LIKE '%' || X'09' || '%';
SELECT * FROM table WHERE column LIKE '%' || CHAR(9) || '%';

Find CR LF

SELECT * FROM table WHERE column LIKE '%' || CHAR(13) || CHAR(10) || '%';
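A quick way to sanity-check these patterns is from Python's built-in sqlite3 module (the table and data below are made up for illustration):

```python
import sqlite3

# Throwaway in-memory table for illustration; table/column names are made up
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (col TEXT)")
con.executemany("INSERT INTO t VALUES (?)", [("a\tb",), ("plain",), ("x\r\ny",)])

# Rows containing a TAB
tabs = con.execute("SELECT col FROM t WHERE col LIKE '%' || CHAR(9) || '%'").fetchall()
# Rows containing CR LF
crlf = con.execute("SELECT col FROM t WHERE col LIKE '%' || CHAR(13) || CHAR(10) || '%'").fetchall()
print(tabs)  # [('a\tb',)]
print(crlf)  # [('x\r\ny',)]
```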
 

Wednesday, August 30, 2023

Python aiosqlite - async SQLITE

Install the aiosqlite package: aiosqlite is an async wrapper around sqlite3 that works with Python's asyncio library. You can install it with pip:

pip install aiosqlite

Import the necessary libraries: You need to import aiosqlite and asyncio for this task.

Create an async function to execute your query: This function will connect to a SQLite database, execute a query, and return the results. It uses the aiosqlite.connect() method to establish a connection to the database and the execute() method to run the SQL query.

Create tasks for each database: For each database you want to search, create a task using asyncio.create_task(). Each of these tasks will run the query_db function concurrently.

Execute the tasks: Use asyncio.gather() to run all the tasks concurrently and wait for all of them to finish. The results will be returned as a list of lists, with each sublist containing the results from one database.

Iterate through the results: You can now iterate through the results to process the data from each database.

import asyncio
import aiosqlite

async def query_db(database, query):
    async with aiosqlite.connect(database) as db:
        async with db.execute(query) as cursor:
            return await cursor.fetchall()

async def main():
    databases = ["database1.db", "database2.db", "database3.db"]
    query = "SELECT * FROM some_table"
    tasks = [asyncio.create_task(query_db(database, query)) for database in databases]

    results = await asyncio.gather(*tasks)

    for db_results in results:
        for row in db_results:
            # Process row
            pass

asyncio.run(main())

 

This code initiates connections to each of the specified databases concurrently and executes the same query on each. The results from each database are collected into a list of lists, which can then be processed as needed. Note: although asyncio allows concurrent execution of tasks, SQLite databases are stored on disk and the queries are therefore I/O bound; the actual speedup from using asyncio will depend on the speed of your disk and the size of the databases.

Wednesday, August 2, 2023

Dexonline Database in Stardict Format

This is a Romanian dictionary based on the dexonline database, in stardict format. The stardict format is recognized by KoReader (the only open-source e-reader software I have found).

The database is downloaded from dexonline and converted into stardict format.

TLDR

To obtain a stardict file, you first need to produce a gls file from the dexonline database, out of the data returned by this select:

select replace(l.form, "'", '') form, replace(d.htmlRep, "\n", "<br/>") html from Lexem l inner join EntryLexem el on l.id = el.lexemId
inner join Entry e on el.entryId = e.id
inner join EntryDefinition ed on e.id = ed.entryId
inner join Definition d on ed.definitionId = d.id where d.sourceId in (27, 21);

The gls format is quite simple. The first line holds the looked-up word, the next line holds its definition, followed by an empty line. This header (empty lines included) must be added above the data:


#bookname=dexonline.ro
#version=3.0.0
#stripmethod=keep
#sametypesequence=h
#date=2018-01-21
#description=Acesta este o adaptare a bazei de date DEX online sub licența GPL.

and after that the file must be run through stardict-editor to obtain the 3 stardict files. I used particular software and procedures to obtain the gls file, but obviously, since everything involved is free software and plain-text formats, you can use whatever software you have at hand.

How to use it

The 3 dex.* files must be copied into the dictionary folder of KoReader, or of any other reader that understands stardict. If you want, you can repeat the process to regenerate the database and the dictionary.

Software used

Steps

Install the required software.

Watch out for stardict-tools. The package is configured to depend on libmysqlclient, which no longer exists in Arch or Artix Linux. You have to edit the PKGBUILD file and remove that dependency from it. libmysqlclient is probably used by other tools in the package; stardict-editor (the one we need) will still work.

Download the dexonline database from here (installation instructions).

Load it into mysql.

An example can be seen here.

Using SQL Workbench/J, export the data in CSV format

The SQL Workbench/J documentation explains its product-specific commands, such as wbexport; you can find them there. I ran these commands:

wbexport -header=false -file=/tmp/rows.txt -delimiter=, -quoteChar="'" -quoteCharEscaping=none -quoteAlways=false -encoding=UTF-8;

select replace(l.form, "'", '') form, d.htmlRep html from Lexem l inner join EntryLexem el on l.id = el.lexemId
inner join Entry e on el.entryId = e.id
inner join EntryDefinition ed on e.id = ed.entryId
inner join Definition d on ed.definitionId = d.id

After this operation, the data ended up in text form in the file /tmp/rows.txt.

Prepare the text file

The resulting text file must be converted to the .gls format. You can find information about the format here.

I edited the file with vim and made the following changes:

  • :%s/\v'$//g (removed the trailing ' from the end of each line where present)
  • :%s/\v^([^,]+),'/\1, (removed the ' preceding the definitions)
  • :%s/\v^([^,]+),(.*)$/\1^M\2^M (separated the lexemes from the definitions and added a new line after each definition)

Watch out for the special characters: they were entered by pressing Ctrl and V and then Enter. Essentially, the matched text was replaced with new lines.

The data is now in .gls format.
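The full conversion, including the header, can equally be sketched in Python. The parsing convention (one word,'definition' pair per line, as produced by the wbexport settings earlier) is an assumption, and the helper name to_gls is my own:

```python
# Sketch of the vim transformation as a Python function.
# Assumes each exported line looks like  word,'definition'  (the quoting
# produced by the wbexport settings above); adjust if your export differs.
HEADER = """
#bookname=dexonline.ro
#version=3.0.0
#stripmethod=keep
#sametypesequence=h
#date=2018-01-21
#description=Acesta este o adaptare a bazei de date DEX online sub licența GPL.

"""

def to_gls(lines):
    entries = []
    for line in lines:
        word, _, definition = line.partition(",")
        definition = definition.strip()
        # drop the surrounding ' quote characters, if present
        if definition.startswith("'") and definition.endswith("'"):
            definition = definition[1:-1]
        # gls entry: word on one line, definition on the next, then a blank line
        entries.append(f"{word}\n{definition}\n")
    return HEADER + "\n".join(entries)
```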

Create the file dex.gls with the following content:


#bookname=dexonline.ro
#version=3.0.0
#stripmethod=keep
#sametypesequence=h
#date=2018-01-21
#description=Acesta este o adaptare a bazei de date DEX online sub licența GPL.

-modified-rows.txt-content-

Watch out for the empty lines. The first line of the file must be an empty line, and the modified content of rows.txt must be inserted after the file header (the lines starting with #).

Save the file and quit vim.

Create the dictionary

Open stardict-editor. In the interface, select the Babylon file format at the bottom left (not Tab file, which is selected by default). Then pick the dex.gls file you created, using the Browse option at the top left, and press Compile at the bottom right. If everything went well, you should get a message like this:

Building...
Over
wordcount: <n>
Done!

where n is the number of words in the dictionary.

Next to the dex.gls file you should now have the 3 stardict files (dex.dict.dz, dex.idx, dex.ifo). These 3 files are the dictionary in stardict format. Depending on the e-reader you use, copy them into the folder indicated by the KoReader documentation.

Source: https://github.com/cosminadrianpopescu/dexonline-stardict

Wednesday, June 14, 2023

Database Hosting

 Free backend hosting

 https://free-for.dev/#/?id=dbaas

Service Type Storage Limitations
Amazon DynamoDB ⚠️ Proprietary NoSQL 25 GB ⚠️ Payment method required
Amazon RDS ⚠️ Proprietary RDBMS ⚠️ Only free for 1 year
Azure SQL Database MS SQL Server ⚠️ Only free for 1 year
👉 Clever Cloud PostgreSQL, MySQL, MongoDB, Redis 256 MB (PostgreSQL) Max 5 connections (PostgreSQL)
ElephantSQL PostgreSQL 20 MB 5 concurrent connections
Fly.io PostgreSQL 3 GB ⚠️ Credit card required, limited outbound traffic, no way to avoid accidental overage fees
Google Cloud Firestore ⚠️ Proprietary NoSQL 1 GB ⚠️ After the first year there's no way to avoid accidental overage fees
Heroku Postgres (discontinued) 😭 PostgreSQL 10K max rows
IBM Cloud Cloudant ⚠️ Proprietary NoSQL 1 GB Deleted after 30 days of no activity
IBM Cloud Db2 Db2 200 MB ⚠️ "users are asked to re-extend their free account every 90 days by email. If you do not re-extend, your free account is cleaned out a further 90 days later"
MongoDB Atlas MongoDB 512 MB
OpenShift Developer Sandbox MariaDB, MongoDB, MySQL, PostgreSQL 15 GB ⚠️ Expires after 30 days (can resubscribe for free but not extend subscription), pods are automatically deleted after 12 consecutive hours of runtime
Oracle Cloud Oracle Database 20 GB each per two databases ⚠️ Payment method required
Redis Enterprise Redis 30 MB
Scalingo PostgreSQL 128 MB Max 10 connections, ⚠️ Payment method required after 30 day trial
👉 Supabase PostgreSQL 500 MB Paused after 1 week inactivity, 2 GB transfer limit
  
 
https://railway.app/ provides free MySQL and PostgreSQL hosting.
Clever Cloud also allows free 500MB MongoDB -> https://www.clever-cloud.com/pricing/#MongoDB 
bit.io offers three free PostgreSQL instances with a 3 GB data cap and moderate usage as well. https://bit.io/pricing
https://remotemysql.com/ - Free 100MB MySQL
https://www.infinityfree.net/ - 400 free MySQL databases (with PHP hosting, not remote)
https://freedb.tech/ - Free 50MB MySQL

 

Service Type RAM Storage Limitations
👉 Adaptable PaaS 256 MB Non-persistent? (1 GB database storage available)
AWS EC2 IaaS 1 GB ⚠️ Only free for 1 year
Azure App Service PaaS 1 GB 1 GB ⚠️ 60 CPU minutes/day
Azure VM IaaS 1 GB 2 GB (non-persistent) ⚠️ Only free for 1 year
👉 Fly.io PaaS 256 MB (2 instances) 1 GB (persistent)
  • 160 GB/month outbound traffic (broken down by region)
  • ❓ Trial plan (2 instances and 1 GB storage) doesn't require card, but does it expire?
  • ⚠️ Hobby plan (3 instances and 3 GB storage) requires credit card, no way to avoid accidental overage fees
Google App Engine PaaS 128 MB 1 GB
Google Compute Engine IaaS 0.6 GB 30 GB
  • 1 GB/month outbound traffic (not including China or Australia)
  • ⚠️ After the first year there's no way to avoid accidental overage fees
Heroku (discontinued) 😭 PaaS 512 MB Non-persistent Idling after 30 minutes inactivity, limited to 550 hours/month
IBM Cloud Foundry PaaS 256 MB ⚠️ Sleep after 10 days, deleted after 30 days of no "development activity"
👉 Northflank PaaS ??? (2 instances) Non-persistent ⚠️ Credit card required for signup, won't be billed on the free tier
OpenShift Developer Sandbox PaaS 7 GB 15 GB ⚠️ Expires after 30 days (can resubscribe for free but not extend subscription), pods are automatically deleted after 12 consecutive hours of runtime
Oracle Cloud IaaS
  • 1 GB each for 2 VMs (AMD CPU)
  • 24 GB across up to 4 VMs (ARM CPU)
200 GB across 2 volumes
  • AMD CPUs limited to 1/8 of 2 CPUs per VM
  • ⚠️ Payment method required
👉 Railway PaaS 512 MB 1 GB
  • Limited to 500 hours/month
  • 100 GB/month outbound traffic
👉 Render PaaS 512 MB Non-persistent
  • No persistent storage
  • Idling after 15 minutes of inactivity
  • 750 hours runtime/month