Tuesday, April 16, 2019

Precautionary statements (P)

P101
If medical advice is needed, have product container or label at hand.
P102
Keep out of reach of children.
P103
Read label before use.
P201
Obtain special instructions before use.
P202
Do not handle until all safety precautions have been read and understood.
P210
Keep away from heat, hot surfaces, sparks, open flames and other ignition sources. No smoking.
P211
Do not spray on an open flame or other ignition source.
P220
Keep/Store away from clothing/…/combustible materials.
P221
Take any precaution to avoid mixing with combustibles…
P222
Do not allow contact with air.
P223
Do not allow contact with water.
P230
Keep wetted with…
P231
Handle under inert gas.
P232
Protect from moisture.
P233
Keep container tightly closed.
P234
Keep only in original container.
P235
Keep cool.
P240
Ground/bond container and receiving equipment.
P241
Use explosion-proof electrical/ventilating/lighting/…/equipment.
P242
Use only non-sparking tools.
P243
Take precautionary measures against static discharge.
P244
Keep valves and fittings free from oil and grease.
P250
Do not subject to grinding/shock/…/friction.
P251
Do not pierce or burn, even after use.
P260
Do not breathe dust/fume/gas/mist/vapours/spray.
P261
Avoid breathing dust/fume/gas/mist/vapours/spray.
P262
Do not get in eyes, on skin, or on clothing.
P263
Avoid contact during pregnancy/while nursing.
P264
Wash … thoroughly after handling.
P270
Do not eat, drink or smoke when using this product.
P271
Use only outdoors or in a well-ventilated area.
P272
Contaminated work clothing should not be allowed out of the workplace.
P273
Avoid release to the environment.
P280
Wear protective gloves/protective clothing/eye protection/face protection.
P282
Wear cold insulating gloves/face shield/eye protection.
P283
Wear fire/flame resistant/retardant clothing.
P284
[In case of inadequate ventilation] wear respiratory protection.
P231 + P232
Handle under inert gas. Protect from moisture.
P235 + P410
Keep cool. Protect from sunlight.
P301
IF SWALLOWED:
P302
IF ON SKIN:
P303
IF ON SKIN (or hair):
P304
IF INHALED:
P305
IF IN EYES:
P306
IF ON CLOTHING:
P308
IF exposed or concerned:
P310
Immediately call a POISON CENTER/doctor/…
P311
Call a POISON CENTER/doctor/…
P312
Call a POISON CENTER/doctor/…/if you feel unwell.
P313
Get medical advice/attention.
P314
Get medical advice/attention if you feel unwell.
P315
Get immediate medical advice/attention.
P320
Specific treatment is urgent (see … on this label).
P321
Specific treatment (see … on this label).
P330
Rinse mouth.
P331
Do NOT induce vomiting.
P332
If skin irritation occurs:
P333
If skin irritation or rash occurs:
P334
Immerse in cool water/wrap in wet bandages.
P335
Brush off loose particles from skin.
P336
Thaw frosted parts with lukewarm water. Do not rub affected area.
P337
If eye irritation persists:
P338
Remove contact lenses, if present and easy to do. Continue rinsing.
P340
Remove person to fresh air and keep comfortable for breathing.
P342
If experiencing respiratory symptoms:
P351
Rinse cautiously with water for several minutes.
P352
Wash with plenty of water/…
P353
Rinse skin with water/shower.
P360
Rinse immediately contaminated clothing and skin with plenty of water before removing clothes.
P361
Take off immediately all contaminated clothing.
P362
Take off contaminated clothing.
P363
Wash contaminated clothing before reuse.
P364
And wash it before reuse.
P370
In case of fire:
P371
In case of major fire and large quantities:
P372
Explosion risk in case of fire.
P373
DO NOT fight fire when fire reaches explosives.
P374
Fight fire with normal precautions from a reasonable distance.
P375
Fight fire remotely due to the risk of explosion.
P376
Stop leak if safe to do so.
P377
Leaking gas fire: Do not extinguish, unless leak can be stopped safely.
P378
Use … to extinguish.
P380
Evacuate area.
P381
Eliminate all ignition sources if safe to do so.
P390
Absorb spillage to prevent material damage.
P391
Collect spillage.
P301 + P310
IF SWALLOWED: Immediately call a POISON CENTER/doctor/…
P301 + P312
IF SWALLOWED: Call a POISON CENTER/doctor/…/if you feel unwell.
P301 + P330 + P331
IF SWALLOWED: Rinse mouth. Do NOT induce vomiting.
P302 + P334
IF ON SKIN: Immerse in cool water/wrap in wet bandages.
P302 + P352
IF ON SKIN: Wash with plenty of water/…
P303 + P361 + P353
IF ON SKIN (or hair): Take off immediately all contaminated clothing. Rinse skin with water/shower.
P304 + P340
IF INHALED: Remove person to fresh air and keep comfortable for breathing.
P305 + P351 + P338
IF IN EYES: Rinse cautiously with water for several minutes. Remove contact lenses, if present and easy to do. Continue rinsing.
P306 + P360
IF ON CLOTHING: Rinse immediately contaminated clothing and skin with plenty of water before removing clothes.
P308 + P311
IF exposed or concerned: Call a POISON CENTER/doctor/…
P308 + P313
IF exposed or concerned: Get medical advice/attention.
P332 + P313
If skin irritation occurs: Get medical advice/attention.
P333 + P313
If skin irritation or rash occurs: Get medical advice/attention.
P335 + P334
Brush off loose particles from skin. Immerse in cool water/wrap in wet bandages.
P337 + P313
If eye irritation persists: Get medical advice/attention.
P342 + P311
If experiencing respiratory symptoms: Call a POISON CENTER/doctor/…
P361 + P364
Take off immediately all contaminated clothing and wash it before reuse.
P362 + P364
Take off contaminated clothing and wash it before reuse.
P370 + P376
In case of fire: Stop leak if safe to do so.
P370 + P378
In case of fire: Use … to extinguish.
P370 + P380
In case of fire: Evacuate area.
P370 + P380 + P375
In case of fire: Evacuate area. Fight fire remotely due to the risk of explosion.
P371 + P380 + P375
In case of major fire and large quantities: Evacuate area. Fight fire remotely due to the risk of explosion.
P401
Store …
P402
Store in a dry place.
P403
Store in a well-ventilated place.
P404
Store in a closed container.
P405
Store locked up.
P406
Store in corrosive resistant/… container with a resistant inner liner.
P407
Maintain air gap between stacks/pallets.
P410
Protect from sunlight.
P411
Store at temperatures not exceeding … °C/… °F.
P412
Do not expose to temperatures exceeding 50 °C/122 °F.
P413
Store bulk masses greater than … kg/… lbs at temperatures not exceeding … °C/… °F.
P420
Store away from other materials.
P422
Store contents under …
P402 + P404
Store in a dry place. Store in a closed container.
P403 + P233
Store in a well-ventilated place. Keep container tightly closed.
P403 + P235
Store in a well-ventilated place. Keep cool.
P410 + P403
Protect from sunlight. Store in a well-ventilated place.
P410 + P412
Protect from sunlight. Do not expose to temperatures exceeding 50 °C/122 °F.
P411 + P235
Store at temperatures not exceeding … °C/… °F. Keep cool.
P501
Dispose of contents/container to …
P502
Refer to manufacturer for information on recovery/recycling.
Source: https://www.msds-europe.com

Hazard statements (H)

Hazard and precautionary statements are coded using a unique alphanumeric code consisting of one letter and three digits, as follows:
 • the letter "H" (for "hazard statement") or "P" (for "precautionary statement"). Note that hazard statements carried over from the DSD and DPD but not included in the GHS are coded as "EUH";
 • one digit designating the type of hazard, e.g. "2" for physical hazards; and
 • two digits corresponding to the sequential numbering of hazards, such as explosivity (codes 200 to 210), flammability (codes 220 to 230), etc.
Your labels must also contain the relevant hazard statements describing the nature and severity of the hazards of the substance or mixture (CLP Article 21).
The hazard statements relevant for each specific hazard classification are set out in the tables in Parts 2 to 5 of Annex I to the CLP Regulation. Where a substance classification is harmonised and included in Part 3 of Annex VI to the CLP Regulation, the hazard statement corresponding to that classification must be used on the label, together with any other hazard statement for a non-harmonised classification.
Annex III to CLP lists the correct wording of each hazard statement as it must appear on the label. Hazard statements in one language must be grouped on the label together with the precautionary statements in the same language.
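The coding scheme described above (a letter prefix, one type digit, two serial digits) can be sketched in Python. This is only an illustrative parser, not part of CLP: the function name parse_code is made up, and of the category tables only "2 = physical hazards" comes from the text above. The other digits are filled in from the general CLP/GHS numbering and should be treated as an assumption. It handles single codes, not combinations such as "P301 + P310".

```python
import re

# Type digits. Only "2" for physical hazards is stated in the text above;
# the remaining entries are assumptions taken from the CLP/GHS numbering.
H_TYPES = {"2": "physical hazard", "3": "health hazard", "4": "environmental hazard"}
P_TYPES = {"1": "general", "2": "prevention", "3": "response",
           "4": "storage", "5": "disposal"}

# Prefix, optional space, type digit, two serial digits, optional letter suffix
# (e.g. the "FD" in H360FD or the "i" in H350i).
CODE_RE = re.compile(r"^(EUH|H|P)\s?(\d)(\d{2})([A-Za-z]*)$")


def parse_code(code):
    """Split a single H, P or EUH code into its parts (hypothetical helper)."""
    match = CODE_RE.match(code.strip())
    if match is None:
        raise ValueError("not a single H/P/EUH code: %r" % code)
    prefix, type_digit, serial, suffix = match.groups()
    if prefix == "H":
        category = H_TYPES.get(type_digit, "unknown")
    elif prefix == "P":
        category = P_TYPES.get(type_digit, "unknown")
    else:
        category = "EU statement not included in GHS"
    return {"prefix": prefix, "type_digit": type_digit, "serial": serial,
            "suffix": suffix, "category": category}
```

For example, parse_code("H225") reports the prefix "H", type digit "2" and serial "25".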

Hazard statements (H)

H200
Unstable explosive.
H201
Explosive; mass explosion hazard.
H202
Explosive; severe projection hazard.
H203
Explosive; fire, blast or projection hazard.
H204
Fire or projection hazard.
H205
May mass explode in fire.
H220
Extremely flammable gas.
H221
Flammable gas.
H222
Extremely flammable aerosol.
H223
Flammable aerosol.
H224
Extremely flammable liquid and vapour.
H225
Highly flammable liquid and vapour.
H226
Flammable liquid and vapour.
H228
Flammable solid.
H229
Pressurised container: May burst if heated.
H230
May react explosively even in the absence of air.
H231
May react explosively even in the absence of air at elevated pressure and/or temperature.
H240
Heating may cause an explosion.
H241
Heating may cause a fire or explosion.
H242
Heating may cause a fire.
H250
Catches fire spontaneously if exposed to air.
H251
Self-heating; may catch fire.
H252
Self-heating in large quantities; may catch fire.
H260
In contact with water releases flammable gases which may ignite spontaneously.
H261
In contact with water releases flammable gases.
H270
May cause or intensify fire; oxidiser.
H271
May cause fire or explosion; strong oxidiser.
H272
May intensify fire; oxidiser.
H280
Contains gas under pressure; may explode if heated.
H281
Contains refrigerated gas; may cause cryogenic burns or injury.
H290
May be corrosive to metals.
H300
Fatal if swallowed.
H301
Toxic if swallowed.
H302
Harmful if swallowed.
H304
May be fatal if swallowed and enters airways.
H310
Fatal in contact with skin.
H311
Toxic in contact with skin.
H312
Harmful in contact with skin.
H314
Causes severe skin burns and eye damage.
H315
Causes skin irritation.
H317
May cause an allergic skin reaction.
H318
Causes serious eye damage.
H319
Causes serious eye irritation.
H330
Fatal if inhaled.
H331
Toxic if inhaled.
H332
Harmful if inhaled.
H334
May cause allergy or asthma symptoms or breathing difficulties if inhaled.
H335
May cause respiratory irritation.
H336
May cause drowsiness or dizziness.
H340
May cause genetic defects <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H341
Suspected of causing genetic defects <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H350
May cause cancer <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H350i
May cause cancer by inhalation.
H351
Suspected of causing cancer <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H360
May damage fertility or the unborn child <state specific effect if known> <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H360F – May damage fertility.
H360D – May damage the unborn child.
H360FD – May damage fertility. May damage the unborn child.
H360Fd – May damage fertility. Suspected of damaging the unborn child.
H360Df – May damage the unborn child. Suspected of damaging fertility.
H361
Suspected of damaging fertility or the unborn child <state specific effect if known> <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H361f – Suspected of damaging fertility.
H361d – Suspected of damaging the unborn child.
H361fd – Suspected of damaging fertility. Suspected of damaging the unborn child.
H362
May cause harm to breast-fed children.
H370
Causes damage to organs <or state all organs affected, if known> <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H371
May cause damage to organs <or state all organs affected, if known> <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H372
Causes damage to organs <or state all organs affected, if known> through prolonged or repeated exposure <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H373
May cause damage to organs <or state all organs affected, if known> through prolonged or repeated exposure <state route of exposure if it is conclusively proven that no other routes of exposure cause the hazard>.
H300 + H310
Fatal if swallowed or in contact with skin.
H300 + H330
Fatal if swallowed or if inhaled.
H310 + H330
Fatal in contact with skin or if inhaled.
H300 + H310 + H330
Fatal if swallowed, in contact with skin or if inhaled.
H301 + H311
Toxic if swallowed or in contact with skin.
H301 + H331
Toxic if swallowed or if inhaled.
H311 + H331
Toxic in contact with skin or if inhaled.
H301 + H311 + H331
Toxic if swallowed, in contact with skin or if inhaled.
H302 + H312
Harmful if swallowed or in contact with skin.
H302 + H332
Harmful if swallowed or if inhaled.
H312 + H332
Harmful in contact with skin or if inhaled.
H302 + H312 + H332
Harmful if swallowed, in contact with skin or if inhaled.
H400
Very toxic to aquatic life.
H410
Very toxic to aquatic life with long lasting effects.
H411
Toxic to aquatic life with long lasting effects.
H412
Harmful to aquatic life with long lasting effects.
H413
May cause long lasting harmful effects to aquatic life.
H420
Harms public health and the environment by destroying ozone in the upper atmosphere.
EUH 001
Explosive when dry.
EUH 014
Reacts violently with water.
EUH 018
In use may form flammable/explosive vapour-air mixture.
EUH 019
May form explosive peroxides.
EUH 044
Risk of explosion if heated under confinement.
EUH 029
Contact with water liberates toxic gas.
EUH 031
Contact with acids liberates toxic gas.
EUH 032
Contact with acids liberates very toxic gas.
EUH 066
Repeated exposure may cause skin dryness or cracking.
EUH 070
Toxic by eye contact.
EUH 071
Corrosive to the respiratory tract.
EUH 201/201A
Contains lead. Should not be used on surfaces liable to be chewed or sucked by children. Warning! Contains lead.
EUH 202
Cyanoacrylate. Danger. Bonds skin and eyes in seconds. Keep out of the reach of children.
EUH 203
Contains chromium (VI). May produce an allergic reaction.
EUH 204
Contains isocyanates. May produce an allergic reaction.
EUH 205
Contains epoxy constituents. May produce an allergic reaction.
EUH 206
Warning! Do not use together with other products. May release dangerous gases (chlorine).
EUH 207
Warning! Contains cadmium. Dangerous fumes are formed during use. See information supplied by the manufacturer. Comply with the safety instructions.
EUH 208
Contains <name of sensitising substance>. May produce an allergic reaction.
EUH 209/209A
Can become highly flammable in use. Can become flammable in use.
EUH 210
Safety data sheet available on request.
EUH 401
To avoid risks to human health and the environment, comply with the instructions for use.
Source: https://www.msds-europe.com

Wednesday, November 28, 2018

Get Line Number in file with Python

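lookup = 'text to find'
filename = 'some_file.txt'  # path of the file to search

with open(filename) as myFile:
  # enumerate(..., 1) numbers the lines starting from 1
  for num, line in enumerate(myFile, 1):
    if lookup in line:
      print('found at line:', num)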
 
Or:
 
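search_phrase = "the dog barked"
line_num = 0
# "with" closes the file even if an error occurs
with open('some_file.txt', 'r') as f:
  for line in f:  # iterate line by line instead of loading everything with readlines()
    line_num += 1
    if line.find(search_phrase) >= 0:
      print(line_num)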
 
Or:
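def line_num_for_phrase_in_file(phrase='the dog barked', filename='file.txt'):
  with open(filename, 'r') as f:
    # start counting at 1 so the result is a line number, as in the other examples
    for (i, line) in enumerate(f, 1):
      if phrase in line:
        return i
  return -1  # phrase not found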

Or:
lookup = "The_String_You're_Searching"
with open("file.txt") as file_obj:  # closed automatically at the end of the block
    for num, line in enumerate(file_obj, 1):
        if lookup in line:
            print(num)
 
Or:
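# path is assumed to hold the location of the file to search
with open(path, 'r') as f_rd:
    file_lines = f_rd.readlines()

matches = [line for line in file_lines if "chars of Interest" in line]
# 0-based index of the first matching line, or None when nothing matches
index = file_lines.index(matches[0]) if matches else None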
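
The variants above can be folded into one small helper. This is a minimal sketch, not a canonical recipe: the name find_line_numbers and the utf-8 default encoding are assumptions. It returns every matching line number (1-based) rather than only the first.

```python
def find_line_numbers(path, phrase, encoding="utf-8"):
    """Return the 1-based numbers of all lines in `path` that contain `phrase`."""
    with open(path, "r", encoding=encoding) as handle:
        # The file is read lazily, one line at a time.
        return [num for num, line in enumerate(handle, 1) if phrase in line]
```

An empty list means the phrase was not found, which avoids the -1 sentinel used in one of the snippets above.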

Undocumented Features and Limitations of the Windows FINDSTR Command

Source: https://stackoverflow.com
The Windows FINDSTR command is horribly documented. There is very basic command line help available through FINDSTR /?, or HELP FINDSTR, but it is woefully inadequate. There is a wee bit more documentation online at https://docs.microsoft.com/en-us/windows-server/administration/windows-commands/findstr.
There are many FINDSTR features and limitations that are not even hinted at in the documentation. Nor could they be anticipated without prior knowledge and/or careful experimentation.
So the question is - What are the undocumented FINDSTR features and limitations?
The purpose of this question is to provide a one stop repository of the many undocumented features so that:
A) Developers can take full advantage of the features that are there.
B) Developers don't waste their time wondering why something doesn't work when it seems like it should.
Please make sure you know the existing documentation before responding. If the information is covered by the HELP, then it does not belong here.
Neither is this a place to show interesting uses of FINDSTR. If a logical person could anticipate the behavior of a particular usage of FINDSTR based on the documentation, then it does not belong here.
Along the same lines, if a logical person could anticipate the behavior of a particular usage based on information contained in any existing answers, then again, it does not belong here.
 
Preface
Much of the information in this answer has been gathered based on experiments run on a Vista machine. Unless explicitly stated otherwise, I have not confirmed whether the information applies to other Windows versions.
FINDSTR output
The documentation never bothers to explain the output of FINDSTR. It alludes to the fact that matching lines are printed, but nothing more.
The format of matching line output is as follows:
fileName:lineNumber:lineOffset:text
where
fileName: = The name of the file containing the matching line. The file name is not printed if the request was explicitly for a single file, or if searching piped input or redirected input. When printed, the fileName will always include any path information provided. Additional path information will be added if the /S option is used. The printed path is always relative to the provided path, or relative to the current directory if none provided.
Note - The filename prefix can be avoided when searching multiple files by using the non-standard (and poorly documented) wildcards < and >. The exact rules for how these wildcards work, and an example of how they work with FINDSTR, are covered in separate posts.
lineNumber: = The line number of the matching line represented as a decimal value with 1 representing the 1st line of the input. Only printed if /N option is specified.
lineOffset: = The decimal byte offset of the start of the matching line, with 0 representing the 1st character of the 1st line. Only printed if /O option is specified. This is not the offset of the match within the line. It is the number of bytes from the beginning of the file to the beginning of the line.
text = The binary representation of the matching line, including any <CR> and/or <LF>. Nothing is left out of the binary output, such that this example that matches all lines will produce an exact binary copy of the original file.
FINDSTR "^" FILE >FILE_COPY
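To make the output format concrete, here is a rough Python sketch that splits one line of FINDSTR output into the fields described above. It is not an official parser: the function name and flags are invented for illustration, and the caller has to say which prefixes are present (file name printed, /N used, /O used) because the output itself does not. A leading drive letter such as C: is tolerated, so a file whose whole name is a single letter would be misread.

```python
import re

# Optional drive letter, then everything up to the next colon.
_FILE_RE = re.compile(r"((?:[A-Za-z]:)?[^:]*):")


def parse_findstr_line(line, has_filename=False, has_line_number=False,
                       has_offset=False):
    """Split 'fileName:lineNumber:lineOffset:text' into a dict (hypothetical helper)."""
    fields = {}
    rest = line
    if has_filename:
        match = _FILE_RE.match(rest)
        if match is None:
            raise ValueError("no file name prefix found")
        fields["fileName"] = match.group(1)
        rest = rest[match.end():]
    for key, present in (("lineNumber", has_line_number),
                         ("lineOffset", has_offset)):
        if present:
            number, sep, rest = rest.partition(":")
            if not sep or not number.isdigit():
                raise ValueError("expected a decimal %s prefix" % key)
            fields[key] = int(number)
    # Whatever remains is the matching line itself, colons included.
    fields["text"] = rest
    return fields
```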
Most control characters and many extended ASCII characters display as dots on XP
FINDSTR on XP displays most non-printable control characters from matching lines as dots (periods) on the screen. The following control characters are exceptions; they display as themselves: 0x09 Tab, 0x0A LineFeed, 0x0B Vertical Tab, 0x0C Form Feed, 0x0D Carriage Return.
XP FINDSTR also converts a number of extended ASCII characters to dots as well. The extended ASCII characters that display as dots on XP are the same as those that are transformed when supplied on the command line. See the "Character limits for command line parameters - Extended ASCII transformation" section, later in this post.
Control characters and extended ASCII are not converted to dots on XP if the output is piped, redirected to a file, or within a FOR IN() clause.
Vista and Windows 7 always display all characters as themselves, never as dots.
Return Codes (ERRORLEVEL)
 • 0 (success)
  • Match was found in at least one line of at least one file.
 • 1 (failure)
  • No match was found in any line of any file.
  • Invalid color specified by /A:xx option
 • 2 (error)
  • Incompatible options /L and /R both specified
  • Missing argument after /A:, /F:, /C:, /D:, or /G:
  • File specified by /F:file or /G:file not found
 • 255 (error)
Source of data to search (Updated based on tests with Windows 7)
Findstr can search data from only one of the following sources:
 • filenames specified as arguments and/or using the /F:file option.
 • stdin via redirection: findstr "searchString" <file
 • data stream from a pipe: type file | findstr "searchString"
Arguments/options take precedence over redirection, which takes precedence over piped data.
File name arguments and /F:file may be combined. Multiple file name arguments may be used. If multiple /F:file options are specified, then only the last one is used. Wild cards are allowed in filename arguments, but not within the file pointed to by /F:file.
Source of search strings (Updated based on tests with Windows 7)
The /G:file and /C:string options may be combined. Multiple /C:string options may be specified. If multiple /G:file options are specified, then only the last one is used. If either /G:file or /C:string is used, then all non-option arguments are assumed to be files to search. If neither /G:file nor /C:string is used, then the first non-option argument is treated as a space delimited list of search terms.
File names must not be quoted within the file when using the /F:FILE option.
File names may contain spaces and other special characters. Most commands require that such file names are quoted. But the FINDSTR /F:files.txt option requires that filenames within files.txt must NOT be quoted. The file will not be found if the name is quoted.
BUG - Short 8.3 filenames can break the /D and /S options
As with all Windows commands, FINDSTR will attempt to match both the long name and the short 8.3 name when looking for files to search. Assume the current folder contains the following non-empty files:
b1.txt
b.txt2
c.txt
The following command will successfully find all 3 files:
findstr /m "^" *.txt
b.txt2 matches because the corresponding short name B9F64~1.TXT matches. This is consistent with the behavior of all other Windows commands.
But a bug with the /D and /S options causes the following commands to only find b1.txt:
findstr /m /d:. "^" *.txt
findstr /m /s "^" *.txt
The bug prevents b.txt2 from being found, as well as all file names that sort after b.txt2 within the same directory. Additional files that sort before, like a.txt, are found. Additional files that sort later, like d.txt, are missed once the bug has been triggered.
Each directory searched is treated independently. For example, the /S option would successfully begin searching in a child folder after failing to find files in the parent, but once the bug causes a short file name to be missed in the child, then all subsequent files in that child folder would also be missed.
The commands work bug free if the same file names are created on a machine that has NTFS 8.3 name generation disabled. Of course b.txt2 would not be found, but c.txt would be found properly.
Not all short names trigger the bug. All instances of bugged behavior I have seen involve an extension that is longer than 3 characters with a short 8.3 name that begins the same as a normal name that does not require an 8.3 name.
The bug has been confirmed on XP, Vista, and Windows 7.
Non-Printable characters and the /P option
The /P option causes FINDSTR to skip any file that contains any of the following decimal byte codes:
0-7, 14-25, 27-31.
Put another way, the /P option will only skip files that contain non-printable control characters. Control characters are codes less than or equal to 31 (0x1F). FINDSTR treats the following control characters as printable:
 8 0x08 backspace
 9 0x09 horizontal tab
10 0x0A line feed
11 0x0B vertical tab
12 0x0C form feed
13 0x0D carriage return
26 0x1A substitute (end of text)
All other control characters are treated as non-printable, the presence of which causes the /P option to skip the file.
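The byte ranges above translate directly into a check you can run before invoking FINDSTR. A small Python sketch (the function name is made up) that predicts whether /P would skip a file with the given content, based only on the codes listed in this section:

```python
# Byte codes listed above as non-printable for /P: 0-7, 14-25, 27-31.
NON_PRINTABLE = (frozenset(range(0, 8))
                 | frozenset(range(14, 26))
                 | frozenset(range(27, 32)))


def would_be_skipped_by_p(data: bytes) -> bool:
    """True if `data` contains any byte that, per the list above, makes /P skip the file."""
    return any(byte in NON_PRINTABLE for byte in data)
```

Backspace (8), tab through carriage return (9-13) and substitute (26) fall outside the set, matching the table of control characters that FINDSTR treats as printable.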
Piped and Redirected input may have <CR><LF> appended
If the input is piped in and the last character of the stream is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. This has been confirmed on XP, Vista and Windows 7. (I used to think that the Windows pipe was responsible for modifying the input, but I have since discovered that FINDSTR is actually doing the modification.)
The same is true for redirected input on Vista. If the last character of a file used as redirected input is not <LF>, then FINDSTR will automatically append <CR><LF> to the input. However, XP and Windows 7 do not alter redirected input.
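The appending rule can be written down as a tiny model. This Python sketch only restates the observations above (pipe: XP, Vista and Windows 7; redirection: Vista only). The function and parameter names are invented, and what happens with completely empty input is not covered by the text, so the sketch leaves empty input untouched as an assumption.

```python
def effective_input(data: bytes, source: str, windows: str) -> bytes:
    """Model of the bytes FINDSTR ends up searching, per the observations above."""
    if source not in ("pipe", "redirect"):
        raise ValueError("source must be 'pipe' or 'redirect'")
    if windows not in ("xp", "vista", "7"):
        raise ValueError("windows must be 'xp', 'vista' or '7'")
    # Piped input is modified on all three versions; redirected input only on Vista.
    appends = source == "pipe" or windows == "vista"
    # Assumption: empty input is returned unchanged (not tested in the text).
    if appends and data and not data.endswith(b"\n"):
        return data + b"\r\n"
    return data
```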
FINDSTR hangs on XP and Windows 7 if redirected input does not end with <LF>
This is a nasty "feature" on XP and Windows 7. If a file used as redirected input does not end with <LF>, then FINDSTR will hang indefinitely once it reaches the end of the redirected file.
Last line of Piped data may be ignored if it consists of a single character
If the input is piped in and the last line consists of a single character that is not followed by <LF>, then FINDSTR completely ignores the last line.
Example - The first command with a single character and no <LF> fails to match, but the second command with 2 characters works fine, as does the third command that has one character with terminating newline.
> set /p "=x" <nul | findstr "^"

> set /p "=xx" <nul | findstr "^"
xx

> echo x| findstr "^"
x
Reported by DosTips user Sponge Belly in the thread "new findstr bug". Confirmed on XP, Windows 7 and Windows 8. Haven't heard about Vista yet. (I no longer have Vista to test).
Option syntax
Options can be prefixed with either / or -. Options may be concatenated after a single / or -. However, the concatenated option list may contain at most one multi-character option such as OFF or F:, and the multi-character option must be the last option in the list.
The following are all equivalent ways of expressing a case insensitive regex search for any line that contains both "hello" and "goodbye" in any order:
 • /i /r /c:"hello.*goodbye" /c:"goodbye.*hello"
 • -i -r -c:"hello.*goodbye" /c:"goodbye.*hello"
 • /irc:"hello.*goodbye" /c:"goodbye.*hello"
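For readers more at home in Python, the logic of that search (two patterns, a line matches if either one does, case ignored) looks roughly like this. It is only an analogy: Python's re module is not the FINDSTR regex engine, and the names here are made up. For these two simple patterns the syntax happens to be the same in both.

```python
import re

# One compiled pattern per /c: search string; re.IGNORECASE plays the role of /i.
PATTERNS = [
    re.compile(r"hello.*goodbye", re.IGNORECASE),
    re.compile(r"goodbye.*hello", re.IGNORECASE),
]


def contains_both(line):
    """True if the line contains "hello" and "goodbye" in either order."""
    # Multiple search strings are ORed: one matching pattern is enough.
    return any(pattern.search(line) for pattern in PATTERNS)
```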
Search String length limits
On Vista the maximum allowed length for a single search string is 511 bytes. If any search string exceeds 511 bytes then the result is a FINDSTR: Search string too long. error with ERRORLEVEL 2.
When doing a regular expression search, the maximum search string length is 254 bytes. A regular expression with length between 255 and 511 will result in a FINDSTR: Out of memory error with ERRORLEVEL 2. A regular expression length >511 results in the FINDSTR: Search string too long. error.
On Windows XP the search string length limit is apparently shorter; see the question "Findstr error: "Search string too long": How to extract and match substring in "for" loop?". The XP limit is 127 bytes for both literal and regex searches.
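These limits are easy to check before building a command line. A hedged Python sketch of the rules as stated above: the function name and return strings are my own, the Vista messages are copied from the text, and the message XP prints when its limit is exceeded is not stated here, so the sketch returns a generic description for that case.

```python
def check_search_string(length, regex=False, windows="vista"):
    """Predict the outcome for one search string of `length` bytes (sketch)."""
    if windows == "xp":
        # 127 bytes for both literal and regex searches.
        return "ok" if length <= 127 else "over the XP limit of 127 bytes"
    if windows != "vista":
        raise ValueError("only 'vista' and 'xp' limits are described in the text")
    if length > 511:
        return "FINDSTR: Search string too long."
    if regex and length > 254:
        return "FINDSTR: Out of memory"
    return "ok"
```

Pass the encoded byte length (for example len(s.encode(...)) with whatever code page applies), not the character count.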
Line Length limits
Files specified as a command line argument or via the /F:FILE option have no known line length limit. Searches were successfully run against a 128MB file that did not contain a single <LF>.
Piped data and Redirected input are limited to 8191 bytes per line. This limit is a "feature" of FINDSTR. It is not inherent to pipes or redirection. FINDSTR using redirected stdin or piped input will never match any line that is >=8k bytes. Lines >= 8k generate an error message to stderr, but ERRORLEVEL is still 0 if the search string is found in at least one line of at least one file.
Default type of search: Literal vs Regular Expression
/C:"string" - The default is /L literal. Explicitly combining the /L option with /C:"string" certainly works but is redundant.
"string argument" - The default depends on the content of the very first search string. (Remember that <space> is used to delimit search strings.) If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals. For example, "51.4 200" will be treated as two regular expressions because the first string contains an un-escaped dot, whereas "200 51.4" will be treated as two literals because the first string does not contain any meta-characters.
/G:file - The default depends on the content of the first non-empty line in the file. If the first search string is a valid regular expression that contains at least one un-escaped meta-character, then all search strings are treated as regular expressions. Otherwise all search strings are treated as literals.
Recommendation - Always explicitly specify /L literal option or /R regular expression option when using "string argument" or /G:file.
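The default-type rule for "string argument" can be sketched in Python to make the two examples above concrete. This is only an approximation under stated assumptions: the set of meta-characters below is a guess (the text does not list them), the "valid regular expression" check is skipped, and the function names are hypothetical.

```python
# Assumption: characters treated as regex meta-characters for this rule.
META = set(".*^$[]")


def has_unescaped_meta(s):
    """True if `s` contains a meta-character not preceded by a backslash."""
    i = 0
    while i < len(s):
        if s[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if s[i] in META:
            return True
        i += 1
    return False


def default_is_regex(search_arg):
    """Guess whether a space-delimited search argument defaults to regex.

    Only the first search string decides the type for all of them.
    """
    first = search_arg.split(" ")[0]
    return has_unescaped_meta(first)
```

With this sketch, default_is_regex("51.4 200") is True and default_is_regex("200 51.4") is False, which mirrors the examples in the text. Passing /L or /R explicitly avoids depending on the rule at all.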
BUG - Specifying multiple literal search strings can give unreliable results
The following simple FINDSTR example fails to find a match, even though it should.
echo ffffaaa|findstr /l "ffffaaa faffaffddd"
This bug has been confirmed on Windows Server 2003, Windows XP, Vista, and Windows 7.
Based on experiments, FINDSTR may fail if all of the following conditions are met:
 • The search is using multiple literal search strings
 • The search strings are of different lengths
 • A short search string has some amount of overlap with a longer search string
 • The search is case sensitive (no /I option)
In every failure I have seen, it is always one of the shorter search strings that fails.
For more info see Why doesn't this FINDSTR example with multiple literal search strings find a match?
Quotes and backslashes within command line arguments - Note:
The information within this highlighted section is not 100% accurate. After I wrote this section, user MC ND pointed me to a reference that documents how the Microsoft C/C++ library parses parameters. It is horrifically complicated, but it appears to accurately predict the backslash and quote rules for FINDSTR command line arguments. I recommend you use the highlighted information below as a guide, but if you want more accurate info, refer to the link.

Escaping Quote within command line search strings
Quotes within command line search strings must be escaped with backslash like \". This is true for both literal and regex search strings. This information has been confirmed on XP, Vista, and Windows 7.
Note: The quote may also need to be escaped for the CMD.EXE parser, but this has nothing to do with FINDSTR. For example, to search for a single quote you could use:
FINDSTR \^" file && echo found || echo not found
Escaping Backslash within command line literal search strings
Backslash in a literal search string can normally be represented as \ or as \\. They are typically equivalent. (There may be unusual cases in Vista where the backslash must always be escaped, but I no longer have a Vista machine to test).
But there are some special cases:
When searching for consecutive backslashes, all but the last must be escaped. The last backslash may optionally be escaped.
 • \\ can be coded as \\\ or \\\\
 • \\\ can be coded as \\\\\ or \\\\\\
Searching for one or more backslashes before a quote is bizarre. Logic would suggest that the quote must be escaped, and each of the leading backslashes would need to be escaped, but this does not work! Instead, each of the leading backslashes must be double escaped, and the quote is escaped normally:
 • \" must be coded as \\\\\"
 • \\" must be coded as \\\\\\\\\"
As previously noted, one or more escaped quotes may also require escaping with ^ for the CMD parser
The info in this section has been confirmed on XP and Windows 7.
Escaping Backslash within command line regex search strings
 • Vista only: Backslash in a regex must be either double escaped like \\\\, or else single escaped within a character class set like [\\]
 • XP and Windows 7: Backslash in a regex can always be represented as [\\]. It can normally be represented as \\. But this never works if the backslash precedes an escaped quote.
  One or more backslashes before an escaped quote must either be double escaped, or else coded as [\\]
  • \" may be coded as \\\\\" or [\\]\"
  • \\" may be coded as \\\\\\\\\" or [\\][\\]\" or \\[\\]\"
Escaping Quote and Backslash within /G:FILE literal search strings
Standalone quotes and backslashes within a literal search string file specified by /G:file need not be escaped, but they can be.
" and \" are equivalent.
\ and \\ are equivalent.
If the intent is to find \\, then at least the leading backslash must be escaped. Both \\\ and \\\\ work.
If the intent is to find \", then at least the leading backslash must be escaped. Both \\" and \\\" work.
Escaping Quote and Backslash within /G:FILE regex search strings
This is the one case where the escape sequences work as expected based on the documentation. Quote is not a regex metacharacter, so it need not be escaped (but can be). Backslash is a regex metacharacter, so it must be escaped.
Character limits for command line parameters - Extended ASCII transformation
The null character (0x00) cannot appear in any string on the command line. Any other single byte character can appear in the string (0x01 - 0xFF). However, FINDSTR converts many extended ASCII characters it finds within command line parameters into other characters. This has a major impact in two ways:
1) Many extended ASCII characters will not match themselves if used as a search string on the command line. This limitation is the same for literal and regex searches. If a search string must contain extended ASCII, then the /G:FILE option should be used instead.
2) FINDSTR may fail to find a file if the name contains extended ASCII characters and the file name is specified on the command line. If a file to be searched contains extended ASCII in the name, then the /F:FILE option should be used instead.
Here is a complete list of extended ASCII character transformations that FINDSTR performs on command line strings. Each character is represented as the decimal byte code value. The first code represents the character as supplied on the command line, and the second code represents the character it is transformed into. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.
158 treated as 080   199 treated as 221   226 treated as 071
169 treated as 170   200 treated as 043   227 treated as 112
176 treated as 221   201 treated as 043   228 treated as 083
177 treated as 221   202 treated as 045   229 treated as 115
178 treated as 221   203 treated as 045   231 treated as 116
179 treated as 221   204 treated as 221   232 treated as 070
180 treated as 221   205 treated as 045   233 treated as 084
181 treated as 221   206 treated as 043   234 treated as 079
182 treated as 221   207 treated as 045   235 treated as 100
183 treated as 043   208 treated as 045   236 treated as 056
184 treated as 043   209 treated as 045   237 treated as 102
185 treated as 221   210 treated as 045   238 treated as 101
186 treated as 221   211 treated as 043   239 treated as 110
187 treated as 043   212 treated as 043   240 treated as 061
188 treated as 043   213 treated as 043   242 treated as 061
189 treated as 043   214 treated as 043   243 treated as 061
190 treated as 043   215 treated as 043   244 treated as 040
191 treated as 043   216 treated as 043   245 treated as 041
192 treated as 043   217 treated as 043   247 treated as 126
193 treated as 045   218 treated as 043   249 treated as 250
194 treated as 045   219 treated as 221   251 treated as 118
195 treated as 043   220 treated as 095   252 treated as 110
196 treated as 045   222 treated as 221   254 treated as 221
197 treated as 043   223 treated as 095
198 treated as 221   224 treated as 097
Any character >0 not in the list above is treated as itself, including <CR> and <LF>. The easiest way to include odd characters like <CR> and <LF> is to get them into an environment variable and use delayed expansion within the command line argument.
Character limits for strings found in files specified by /G:FILE and /F:FILE options
The nul (0x00) character can appear in the file, but it functions like the C string terminator. Any characters after a nul character are treated as a different string as if they were on another line.
The <CR> and <LF> characters are treated as line terminators that terminate a string, and are not included in the string.
All other single byte characters are included perfectly within a string.
Searching Unicode files
FINDSTR cannot properly search most Unicode (UTF-16, UTF-16LE, UTF-16BE, UTF-32) because it cannot search for nul bytes and Unicode typically contains many nul bytes.
However, the TYPE command converts UTF-16LE with BOM to a single byte character set, so a command like the following will work with UTF-16LE with BOM.
type unicode.txt|findstr "search"
Note that Unicode code points that are not supported by your active code page will be converted to ? characters.
It is possible to search UTF-8 as long as your search string contains only ASCII. However, the console output of any multi-byte UTF-8 characters will not be correct. But if you redirect the output to a file, then the result will be correctly encoded UTF-8. Note that if the UTF-8 file contains a BOM, then the BOM will be considered as part of the first line, which could throw off a search that matches the beginning of a line.
It is possible to search multi-byte UTF-8 characters if you put your search string in a UTF-8 encoded search file (without BOM), and use the /G option.
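Since the right approach depends on the file's encoding and on whether a BOM is present, a quick BOM check can help decide between a plain FINDSTR call, the TYPE pipe, or a /G search file. The sketch below is an illustration only; the function name detect_bom is hypothetical, and a file without a BOM simply yields None.

```python
import codecs

# Order matters: the UTF-32 LE BOM begins with the same bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(head):
    """Return the encoding named by a leading BOM in `head` (bytes), or None."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

A result of "utf-16-le" suggests the TYPE pipe; "utf-8" means the BOM will be part of the first line seen by FINDSTR.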
End Of Line
FINDSTR breaks lines immediately after every <LF>. The presence or absence of <CR> has no impact on line breaks.
Searching across line breaks
As expected, the . regex metacharacter will not match <CR> or <LF>. But it is possible to search across a line break using a command line search string. Both the <CR> and <LF> characters must be matched explicitly. If a multi-line match is found, only the 1st line of the match is printed. FINDSTR then doubles back to the 2nd line in the source and begins the search all over again - sort of a "look ahead" type feature.
Assume TEST.TXT has these contents (could be Unix or Windows style)
A
A
A
B
A
A
Then this script
@echo off
setlocal
::Define LF variable containing a linefeed (0x0A)
set LF=^


::Above 2 blank lines are critical - do not remove

::Define CR variable containing a carriage return (0x0D)
for /f %%a in ('copy /Z "%~dpf0" nul') do set "CR=%%a"

setlocal enableDelayedExpansion
::regex "!CR!*!LF!" will match both Unix and Windows style End-Of-Line
findstr /n /r /c:"A!CR!*!LF!A" TEST.TXT
gives these results
1:A
2:A
5:A
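The same result can be reproduced outside FINDSTR to see why lines 1, 2 and 5 are reported. The Python sketch below uses Python's own re module, not FINDSTR's engine, so it only models the behaviour described above: a zero-width lookahead lets matches overlap, which stands in for FINDSTR restarting on the second line of a multi-line match. The function name is hypothetical.

```python
import re


def first_lines_of_matches(data, pattern):
    """Return 1-based numbers of the lines on which a (possibly multi-line) match starts."""
    # The lookahead consumes nothing, so a line that ended one match
    # can still start the next one.
    regex = re.compile(b"(?=" + pattern + b")")
    lines = []
    for m in regex.finditer(data):
        line_no = data.count(b"\n", 0, m.start()) + 1
        if line_no not in lines:
            lines.append(line_no)
    return lines


text = b"A\nA\nA\nB\nA\nA\n"
print(first_lines_of_matches(text, rb"A\r*\nA"))  # prints [1, 2, 5]
```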
Searching across line breaks using the /G:FILE option is imprecise because the only way to match <CR> or <LF> is via a regex character class range expression that sandwiches the EOL characters.
 • [<TAB>-<0x0B>] matches <LF>, but it also matches <TAB> and <0x0B>
 • [<0x0C>-!] matches <CR>, but it also matches <0x0C> and !
  Note - the above are symbolic representations of the regex byte stream since I can't graphically represent the characters.

  Limited Regular Expressions (regex) Support
  FINDSTR support for regular expressions is extremely limited. If it is not in the HELP documentation, it is not supported.
  Beyond that, the regular expressions that are supported are implemented in a completely non-standard manner, such that results can be different than would be expected coming from something like grep or perl.
  Regex Line Position anchors ^ and $
  ^ matches beginning of input stream as well as any position immediately following a <LF>. Since FINDSTR also breaks lines after <LF>, a simple regex of "^" will always match all lines within a file, even a binary file.
  $ matches any position immediately preceding a <CR>. This means that a regex search string containing $ will never match any lines within a Unix style text file, nor will it match the last line of a Windows text file if it is missing the EOL marker of <CR><LF>.
  Note - As previously discussed, piped and redirected input to FINDSTR may have <CR><LF> appended that is not in the source. Obviously this can impact a regex search that uses $.
  Any search string with characters before ^ or after $ will always fail to find a match.
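The effect of the $ rule on Unix and Windows style files can be illustrated with a small Python model. It is a sketch of the rule as described above (lines break after <LF>, and $ only matches immediately before a <CR>), not a reimplementation of FINDSTR; the function name is hypothetical and the search term is treated as a plain literal.

```python
def lines_matching_dollar(data, literal):
    """Return 1-based numbers of lines where `literal` is immediately followed by <CR>.

    This models a regex of the form literal$ under the rule described above.
    """
    hits = []
    for number, line in enumerate(data.split(b"\n"), start=1):
        if (literal + b"\r") in line:
            hits.append(number)
    return hits


windows_text = b"foo\r\nbar\r\nfoo"  # last line has no EOL marker
unix_text = b"foo\nbar\nfoo\n"

print(lines_matching_dollar(windows_text, b"foo"))  # prints [1]
print(lines_matching_dollar(unix_text, b"foo"))     # prints []
```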
  Positional Options /B /E /X
  The positional options work the same as ^ and $, except they also work for literal search strings.
  /B functions the same as ^ at the start of a regex search string.
  /E functions the same as $ at the end of a regex search string.
  /X functions the same as having both ^ at the beginning and $ at the end of a regex search string.
  Regex word boundary
  \< must be the very first term in the regex. The regex will not match anything if any other characters precede it. \< corresponds to either the very beginning of the input, the beginning of a line (the position immediately following a <LF>), or the position immediately following any "non-word" character. The next character need not be a "word" character.
  \> must be the very last term in the regex. The regex will not match anything if any other characters follow it. \> corresponds to either the end of input, the position immediately prior to a <CR>, or the position immediately preceding any "non-word" character. The preceding character need not be a "word" character.
  Here is a complete list of "non-word" characters, represented as the decimal byte code. Note - this list was compiled on a U.S machine. I do not know what impact other languages may have on this list.
  001  028  063  179  204  230
  002  029  064  180  205  231
  003  030  091  181  206  232
  004  031  092  182  207  233
  005  032  093  183  208  234
  006  033  094  184  209  235
  007  034  096  185  210  236
  008  035  123  186  211  237
  009  036  124  187  212  238
  011  037  125  188  213  239
  012  038  126  189  214  240
  014  039  127  190  215  241
  015  040  155  191  216  242
  016  041  156  192  217  243
  017  042  157  193  218  244
  018  043  158  194  219  245
  019  044  168  195  220  246
  020  045  169  196  221  247
  021  046  170  197  222  248
  022  047  173  198  223  249
  023  058  174  199  224  250
  024  059  175  200  226  251
  025  060  176  201  227  254
  026  061  177  202  228  255
  027  062  178  203  229
  
  Regex character class ranges [x-y]
  Character class ranges do not work as expected. See this question: Why does findstr not handle case properly (in some circumstances)?, along with this answer: https://stackoverflow.com/a/8767815/1012053.
  The problem is FINDSTR does not collate the characters by their byte code value (commonly thought of as the ASCII code, but ASCII is only defined from 0x00 - 0x7F). Most regex implementations would treat [A-Z] as all upper case English capital letters. But FINDSTR uses a collation sequence that roughly corresponds to how SORT works. So [A-Z] includes the complete English alphabet, both upper and lower case (except for "a"), as well as non-English alpha characters with diacriticals.
  Regex character class term limit and BUG
  Not only is FINDSTR limited to a maximum of 15 character class terms within a regex, it fails to properly handle an attempt to exceed the limit. Using 16 or more character class terms results in an interactive Windows pop up stating "Find String (QGREP) Utility has encountered a problem and needs to close. We are sorry for the inconvenience." The message text varies slightly depending on the Windows version. Here is one example of a FINDSTR that will fail:
  echo 01234567890123456|findstr [0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]
  
  This bug was reported by DosTips user Judago here. It has been confirmed on XP, Vista, and Windows 7.
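A script that builds FINDSTR regexes dynamically could guard against this crash by counting character class terms first. The Python sketch below is a simplification: it assumes a backslash escapes the next character only outside a class, and it counts every unescaped [ that opens a class. FINDSTR's real parsing rules may differ, and the names are hypothetical.

```python
MAX_CLASS_TERMS = 15  # limit stated in the text above


def count_class_terms(regex):
    """Count [...] character class terms in `regex` (simplified parsing)."""
    count = 0
    in_class = False
    i = 0
    while i < len(regex):
        ch = regex[i]
        if not in_class and ch == "\\":
            i += 2  # skip an escaped character outside a class
            continue
        if not in_class and ch == "[":
            in_class = True
            count += 1
        elif in_class and ch == "]":
            in_class = False
        i += 1
    return count


def exceeds_class_limit(regex):
    """True if the regex has more class terms than FINDSTR is said to handle."""
    return count_class_terms(regex) > MAX_CLASS_TERMS
```

For the failing example above, sixteen repetitions of [0-9] give a count of 16, so exceeds_class_limit returns True.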
  Regex searches fail (and may hang indefinitely) if they include byte code 0xFF (decimal 255)
  Any regex search that includes byte code 0xFF (decimal 255) will fail. It fails if byte code 0xFF is included directly, or if it is implicitly included within a character class range. Remember that FINDSTR character class ranges do not collate characters based on the byte code value. Character <0xFF> appears relatively early in the collation sequence between the <space> and <tab> characters. So any character class range that includes both <space> and <tab> will fail.
  The exact behavior changes slightly depending on the Windows version. Windows 7 hangs indefinitely if 0xFF is included. XP doesn't hang, but it always fails to find a match, and occasionally prints the following error message - "The process tried to write to a nonexistent pipe."
  I no longer have access to a Vista machine, so I haven't been able to test on Vista.
  Regex bug: . and [^anySet] can match End-Of-File
  The regex . meta-character should only match any character other than <CR> or <LF>. There is a bug that allows it to match the End-Of-File if the last line in the file is not terminated by <CR> or <LF>. However, the . will not match an empty file.
  For example, a file named "test.txt" containing a single line of x, without terminating <CR> or <LF>, will match the following:
  findstr /r x......... test.txt
  
  This bug has been confirmed on XP and Win7.
  The same seems to be true for negative character sets. Something like [^abc] will match End-Of-File. Positive character sets like [abc] seem to work fine. I have only tested this on Win7.

Another grep.py

Grep.py - Source: https://github.com/rohitkrai03

import sys
import re
import os
import utils

def main():
    files = utils.troll_directories(os.path.normpath(sys.argv[1]))
    patterns = utils.convert_patterns(sys.argv[2:])
    utils.apply_patterns(files, patterns)


if __name__ == '__main__' : main()

Utils.py:

import re
import os

def convert_patterns(patterns):
    results = []

    # for each pattern
    for pattern in patterns:
        # make a regular expression with it
        expr = re.compile(pattern)
        results.append(expr)
    # return the results
    return results

def troll_directories(start):
    # troll for all the directories like in find
    results = []
    # Traverse the directory for all the files.
    for root, dirs, files in os.walk(start):
        for fname in files:
            # put the full path into the results
            results.append(os.path.join(root, fname))
    return results

def apply_patterns(files, patterns):
    # for each file in files
    for fname in files:
        # open the file and read the lines (undecodable bytes are replaced)
        with open(fname, errors='replace') as handle:
            lines = handle.readlines()
        for num, line in enumerate(lines):
            # for each pattern
            for pattern in patterns:
                # if pattern found in this line
                if pattern.search(line):
                    # print file, line number, line (without its trailing newline)
                    print("{}:{}: {}".format(fname, num+1, line.rstrip('\n')))

Python Find

A small Python utility that searches all file names under a given directory for a regular expression and prints the full path of each match.

How to Use

Download the package or clone the repo from GitHub.
Run find.py with the start directory and the regular expression as command-line arguments.

Example: py find.py '.' '.*.py'


Find.py
#!/usr/bin/env python3
import sys
import re
import os
# Get the start directory.
start = os.path.normpath(sys.argv[1])
# Get the pattern from the command line arguments.
pattern = sys.argv[2]
# Convert it to a regular expression.
expr = re.compile(pattern)
# Traverse the directory for all the files.
for root, dirs, files in os.walk(start):
    for fname in files:
        # If a file matches the pattern then print its name.
        if expr.search(fname):
            print(os.path.join(root, fname))

Another interesting project:

https://pypi.org/project/grin

Python 101: Redirecting stdout

Source: https://www.blog.pythonlibrary.org
Redirecting stdout is something most developers will need to do at some point or other. It can be useful to redirect stdout to a file or to a file-like object. I have also redirected stdout to a text control in some of my desktop GUI projects. In this article we will look at the following:
 • Redirecting stdout to a file (simple)
 • The Shell redirection method
 • Redirecting stdout using a custom context manager
 • Python 3’s contextlib.redirect_stdout()
 • Redirect stdout to a wxPython text control


Redirecting stdout

The easiest way to redirect stdout in Python is to just assign it an open file object. Let’s take a look at a simple example:
import sys
 
def redirect_to_file(text):
  original = sys.stdout
  sys.stdout = open('/path/to/redirect.txt', 'w')
  print('This is your redirected text:')
  print(text)
  sys.stdout = original
 
  print('This string goes to stdout, NOT the file!')
 
if __name__ == '__main__':
  redirect_to_file('Python rocks!')
Here we just import Python’s sys module and create a function that we can pass strings that we want to have redirected to a file. We save off a reference to sys.stdout so we can restore it at the end of the function. This can be useful if you intend to use stdout for other things. Before you run this code, be sure to update the path to something that will work on your system. When you run it, you should see the following in your file:

This is your redirected text:
Python rocks!

That last print statement will go to stdout, not the file.
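One thing the simple version leaves open is what happens if a print call raises, and the file it opens is never explicitly closed. A slightly more defensive variant, offered as a sketch rather than as the article's method, closes the file with a with block and restores stdout in a finally clause. The extra path parameter is an addition for illustration.

```python
import sys


def redirect_to_file(text, path):
    """Write two lines to `path` via print(), then restore stdout."""
    original = sys.stdout
    with open(path, 'w') as out:
        sys.stdout = out
        try:
            print('This is your redirected text:')
            print(text)
        finally:
            # Runs even if a print call fails, so stdout is never left redirected.
            sys.stdout = original
```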

Shell Redirection

Shell redirection is also pretty common, especially in Linux, although Windows also works the same way in most cases. Let’s create a silly example of a noisy function that we will call noisy.py:
# noisy.py
def noisy(text):
  print('The noisy function prints a lot')
  print('Here is the string you passed in:')
  print('*' * 40)
  print(text)
  print('*' * 40)
  print('Thank you for calling me!')
 
if __name__ == '__main__':
  noisy('This is a test of Python!')
You will notice that we didn’t import the sys module this time around. The reason is that we don’t need it since we will be using shell redirection. To do shell redirection, open a terminal (or command prompt) and navigate to the folder where you saved the code above. Then execute the following command:

python noisy.py > redirected.txt

The greater than character (i.e. >) tells your operating system to redirect stdout to the filename you specified. At this point you should have a file named “redirected.txt” in the same folder as your Python script. If you open it up, the file should have the following contents:

The noisy function prints a lot
Here is the string you passed in:
****************************************
This is a test of Python!
****************************************
Thank you for calling me!

Now wasn’t that pretty cool?
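The same effect can be had from inside Python by launching the script as a child process and handing it an open file as its stdout. This is a minimal sketch using the standard subprocess module; the helper name run_redirected is hypothetical.

```python
import subprocess
import sys


def run_redirected(args, path):
    """Run a command with its stdout written to `path`; return the exit code."""
    with open(path, 'w') as out:
        completed = subprocess.run(args, stdout=out, check=True)
    return completed.returncode
```

For example, run_redirected([sys.executable, 'noisy.py'], 'redirected.txt') would be roughly equivalent to the shell command above, assuming noisy.py is in the current directory.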

Redirect stdout with a context manager

Another fun way to redirect stdout is by using a context manager. Let’s create a custom context manager that accepts a file object to redirect stdout to:
import sys
from contextlib import contextmanager
 
 
@contextmanager
def custom_redirection(fileobj):
  old = sys.stdout
  sys.stdout = fileobj
  try:
    yield fileobj
  finally:
    sys.stdout = old
 
if __name__ == '__main__':
  with open('/path/to/custom_redir.txt', 'w') as out:
    with custom_redirection(out):
      print('This text is redirected to file')
      print('So is this string')
    print('This text is printed to stdout')
When you run this code, it will write out two lines of text to your file and one to stdout. As usual, we reset stdout at the end of the function.
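The try/finally in the context manager matters most when the redirected code fails. The sketch below repeats the context manager so that it is self-contained, redirects into an in-memory io.StringIO buffer instead of a file, and raises an error on purpose to show that stdout is still restored. The demo function is hypothetical.

```python
import io
import sys
from contextlib import contextmanager


@contextmanager
def custom_redirection(fileobj):
    old = sys.stdout
    sys.stdout = fileobj
    try:
        yield fileobj
    finally:
        sys.stdout = old


def demo():
    """Return (stdout_restored, captured_text) after a failure inside the block."""
    before = sys.stdout
    buffer = io.StringIO()
    try:
        with custom_redirection(buffer):
            print('captured before the error')
            raise ValueError('simulated failure')
    except ValueError:
        pass
    return sys.stdout is before, buffer.getvalue()
```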

Using contextlib.redirect_stdout

Python 3.4 added the redirect_stdout function to its contextlib module. Let’s try using that to create a context manager to redirect stdout:
import sys
from contextlib import redirect_stdout
 
def redirected(text, path):
  with open(path, 'w') as out:
    with redirect_stdout(out):
      print('Here is the string you passed in:')
      print('*' * 40)
      print(text)
      print('*' * 40)
 
if __name__ == '__main__':
  path = '/path/to/red.txt'
  text = 'My test to redirect'
  redirected(text, path)
This code is a little simpler because the built-in function does all the yielding and resetting of stdout automatically for you. Otherwise, it works in pretty much the same way as our custom context manager.
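redirect_stdout accepts any file-like object, so it can also capture output in memory, which is handy in tests. A small sketch, with a hypothetical helper name capture_output:

```python
import io
from contextlib import redirect_stdout


def capture_output(func, *args, **kwargs):
    """Call `func` and return everything it printed to stdout as a string."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()
```

Calling capture_output(print, 'My test to redirect') returns the printed text, including the trailing newline.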

Redirecting stdout in wxPython

import sys
import wx
 
class MyForm(wx.Frame):
 
  def __init__(self):
    wx.Frame.__init__(self, None,
             title="wxPython Redirect Tutorial")
 
    # Add a panel so it looks correct on all platforms
    panel = wx.Panel(self, wx.ID_ANY)
    style = wx.TE_MULTILINE|wx.TE_READONLY|wx.HSCROLL
    log = wx.TextCtrl(panel, wx.ID_ANY, size=(300,100),
             style=style)
    btn = wx.Button(panel, wx.ID_ANY, 'Push me!')
    self.Bind(wx.EVT_BUTTON, self.onButton, btn)
 
    # Add widgets to a sizer
    sizer = wx.BoxSizer(wx.VERTICAL)
    sizer.Add(log, 1, wx.ALL|wx.EXPAND, 5)
    sizer.Add(btn, 0, wx.ALL|wx.CENTER, 5)
    panel.SetSizer(sizer)
 
    # redirect text here
    sys.stdout = log
 
  def onButton(self, event):
    print("You pressed the button!")
 
# Run the program
if __name__ == "__main__":
  app = wx.App(False)
  frame = MyForm().Show()
  app.MainLoop()

This code just creates a simple frame with a panel that contains a multi-line text control and a button. Whenever you press the button, it will print out some text to stdout, which we have redirected to the text control.
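Assigning the text control to sys.stdout works because print only needs an object with a write() method. The idea can be tried without wxPython by using a tiny stand-in class; ListWriter below is hypothetical and simply collects whatever is written to it.

```python
import sys


class ListWriter:
    """Minimal file-like object: print() only requires write()."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)
        return len(text)

    def flush(self):
        # Nothing is buffered, so there is nothing to flush.
        pass


def demo():
    writer = ListWriter()
    original = sys.stdout
    sys.stdout = writer
    try:
        print('You pressed the button!')
    finally:
        sys.stdout = original
    return ''.join(writer.chunks)
```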

Personally I thought it was cool that Python 3 now has a context manager built-in just for this purpose. Speaking of which, Python 3 also has a function for redirecting stderr. All of these examples can be modified slightly to support redirecting stderr or both stdout and stderr. The very last thing we touched on was redirecting stdout to a text control in wxPython. This can be really useful for debugging or for grabbing the output from a subprocess, although in the latter case you will need to print out the output to have it redirected correctly.
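As a sketch of the stderr case mentioned above, contextlib.redirect_stderr (available since Python 3.5) can be combined with redirect_stdout in one with statement. The helper names capture_both and sample are hypothetical.

```python
import io
import sys
from contextlib import redirect_stderr, redirect_stdout


def capture_both(func):
    """Call `func` and return (stdout_text, stderr_text)."""
    out, err = io.StringIO(), io.StringIO()
    with redirect_stdout(out), redirect_stderr(err):
        func()
    return out.getvalue(), err.getvalue()


def sample():
    print('to stdout')
    print('to stderr', file=sys.stderr)
```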

Grep.py

Script: grep.py

Search regular expression in buffers or log files. (for WeeChat ≥ 1.5)
Author: m4v — Version: 0.8.1 — License: GPL3.
Added: 2009-08-18, updated: 2018-04-10.
Other scripts: https://weechat.org/scripts/
Source: https://weechat.org/scripts/source/grep.py.html/

# -*- coding: utf-8 -*-
###
# Copyright (c) 2009-2011 by Elián Hanisch <lambdae2@gmail.com>
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
###

###
# Search in Weechat buffers and logs (for Weechat 0.3.*)
#
#  Inspired by xt's grep.py
#  Originally I just wanted to add some fixes in grep.py, but then
#  I got carried away and rewrote everything, so new script.
#
#  Commands:
#  * /grep
#   Search in logs or buffers, see /help grep
#  * /logs:
#   Lists logs in ~/.weechat/logs, see /help logs
#
#  Settings:
#  * plugins.var.python.grep.clear_buffer:
#   Clear the results buffer before each search. Valid values: on, off
#
#  * plugins.var.python.grep.go_to_buffer:
#   Automatically go to grep buffer when search is over. Valid values: on, off
#
#  * plugins.var.python.grep.log_filter:
#   Coma separated list of patterns that grep will use for exclude logs, e.g.
#   if you use '*server/*' any log in the 'server' folder will be excluded
#   when using the command '/grep log'
#
#  * plugins.var.python.grep.show_summary:
#   Shows summary for each log. Valid values: on, off
#
#  * plugins.var.python.grep.max_lines:
#   Grep will only print the last matched lines that don't surpass the value defined here.
#
#  * plugins.var.python.grep.size_limit:
#   Size limit in KiB, is used for decide whenever grepping should run in background or not. If
#   the logs to grep have a total size bigger than this value then grep run as a new process.
#   It can be used for force or disable background process, using '0' forces to always grep in
#   background, while using '' (empty string) will disable it.
#
#  * plugins.var.python.grep.timeout_secs:
#   Timeout (in seconds) for background grepping.
#
#  * plugins.var.python.grep.default_tail_head:
#   Config option for define default number of lines returned when using --head or --tail options.
#   Can be overriden in the command with --number option.
#
#
#  TODO:
#  * try to figure out why hook_process chokes in long outputs (using a tempfile as a
#  workaround now)
#  * possibly add option for defining time intervals
#
#
#  History:
#
#  2018-04-10, Sébastien Helleu <flashcode@flashtux.org>
#  version 0.8.1: fix infolist_time for WeeChat >= 2.2 (WeeChat returns a long
#         integer instead of a string)
#
#  2017-09-20, mickael9
#  version 0.8:
#  * use weechat 1.5+ api for background processing (old method was unsafe and buggy)
#  * add timeout_secs setting (was previously hardcoded to 5 mins)
#
#  2017-07-23, Sébastien Helleu <flashcode@flashtux.org>
#  version 0.7.8: fix modulo by zero when nick is empty string
#
#  2016-06-23, mickael9
#  version 0.7.7: fix get_home function
#
#  2015-11-26
#  version 0.7.6: fix a typo
#
#  2015-01-31, Nicd-
#  version 0.7.5:
#  '~' is now expaned to the home directory in the log file path so
#  paths like '~/logs/' should work.
#
#  2015-01-14, nils_2
#  version 0.7.4: make q work to quit grep buffer (requested by: gb)
#
#  2014-03-29, Felix Eckhofer <felix@tribut.de>
#  version 0.7.3: fix typo
#
#  2011-01-09
#  version 0.7.2: bug fixes
#
#  2010-11-15
#  version 0.7.1:
#  * use TempFile so temporal files are guaranteed to be deleted.
#  * enable Archlinux workaround.
#
#  2010-10-26
#  version 0.7:
#  * added templates.
#  * using --only-match shows only unique strings.
#  * fixed bug that inverted -B -A switches when used with -t
#
#  2010-10-14
#  version 0.6.8: by xt <xt@bash.no>
#  * supress highlights when printing in grep buffer
#
#  2010-10-06
#  version 0.6.7: by xt <xt@bash.no> 
#  * better temporary file:
#  use tempfile.mkstemp. to create a temp file in log dir, 
#  makes it safer with regards to write permission and multi user
#
#  2010-04-08
#  version 0.6.6: bug fixes
#  * use WEECHAT_LIST_POS_END in log file completion, makes completion faster
#  * disable bytecode if using python 2.6
#  * use single quotes in command string
#  * fix bug that could change buffer's title when using /grep stop
#
#  2010-01-24
#  version 0.6.5: disable bytecode is a 2.6 feature, instead, resort to delete the bytecode manually
#
#  2010-01-19
#  version 0.6.4: bug fix
#  version 0.6.3: added options --invert --only-match (replaces --exact, which is still available
#  but removed from help)
#  * use new 'irc_nick_color' info
#  * don't generate bytecode when spawning a new process
#  * show active options in buffer title
#
#  2010-01-17
#  version 0.6.2: removed 2.6-ish code
#  version 0.6.1: fixed bug when grepping in grep's buffer
#
#  2010-01-14
#  version 0.6.0: implemented grep in background
#  * improved context lines presentation.
#  * grepping for big (or many) log files runs in a weechat_process.
#  * added /grep stop.
#  * added 'size_limit' option
#  * fixed a infolist leak when grepping buffers
#  * added 'default_tail_head' option
#  * results are sort by line count
#  * don't die if log is corrupted (has NULL chars in it)
#  * changed presentation of /logs
#  * log path completion doesn't suck anymore
#  * removed all tabs, because I learned how to configure Vim so that spaces aren't annoying
#  anymore. This was the script's original policy.
#
#  2010-01-05
#  version 0.5.5: rename script to 'grep.py' (FlashCode <flashcode@flashtux.org>).
#
#  2010-01-04
#  version 0.5.4.1: fix index error when using --after/before-context options.
#
#  2010-01-03
#  version 0.5.4: new features
#  * added --after-context and --before-context options.
#  * added --context as a shortcut for using both -A -B options.
#
#  2009-11-06
#  version 0.5.3: improvements for long grep output
#  * grep buffer input accepts the same flags as /grep for repeat a search with different
#   options.
#  * tweaks in grep's output.
#  * max_lines option added for limit grep's output.
#  * code in update_buffer() optimized.
#  * time stats in buffer title.
#  * added go_to_buffer config option.
#  * added --buffer for search only in buffers.
#  * refactoring.
#
#  2009-10-12, omero
#  version 0.5.2: made it python-2.4.x compliant
#
#  2009-08-17
#  version 0.5.1: some refactoring, show_summary option added.
#
#  2009-08-13
#  version 0.5: rewritten from xt's grep.py
#  * fixed searching in non-weechat logs, e.g. if you're switching from irssi
#   and rename and copy your irssi logs to %h/logs
#  * fixed "timestamp rainbow" when you /grep in grep's buffer
#  * allow searching in buffers other than the current one, or in logs
#   of currently closed buffers, with cmd 'buffer'
#  * allow searching in any log file in %h/logs with cmd 'log'
#  * added --count to return the number of matched lines
#  * added --matchcase for case-sensitive search
#  * added --hilight to color matches
#  * added --head and --tail options, and --number
#  * added command /logs to list files in %h/logs
#  * added config option to clear the buffer before a search
#  * added config option to filter logs we don't want to grep
#  * added the possibility to repeat the last search with another regexp by writing
#   it in grep's buffer
#  * changed spaces for tabs in the code, which is my preference
#
###

from os import path
import sys, getopt, time, os, re

try:
  import cPickle as pickle
except ImportError:
  import pickle

try:
  import weechat
  from weechat import WEECHAT_RC_OK, prnt, prnt_date_tags
  import_ok = True
except ImportError:
  import_ok = False

SCRIPT_NAME  = "grep"
SCRIPT_AUTHOR = "Elián Hanisch <lambdae2@gmail.com>"
SCRIPT_VERSION = "0.8.1"
SCRIPT_LICENSE = "GPL3"
SCRIPT_DESC  = "Search in buffers and logs"
SCRIPT_COMMAND = "grep"

### Default Settings ###
settings = {
  'clear_buffer'   : 'off',
  'log_filter'    : '',
  'go_to_buffer'   : 'on',
  'max_lines'     : '4000',
  'show_summary'   : 'on',
  'size_limit'    : '2048',
  'default_tail_head' : '10',
  'timeout_secs'   : '300',
}

### Class definitions ###
class linesDict(dict):
  """
  Class for handling matched lines in more than one buffer.
  linesDict[buffer_name] = matched_lines_list
  """
  def __setitem__(self, key, value):
    assert isinstance(value, list)
    if key not in self:
      dict.__setitem__(self, key, value)
    else:
      dict.__getitem__(self, key).extend(value)

  def get_matches_count(self):
    """Return the sum of total matches stored."""
    if dict.__len__(self):
      return sum(map(lambda L: L.matches_count, self.itervalues()))
    else:
      return 0

  def __len__(self):
    """Return the sum of total lines stored."""
    if dict.__len__(self):
      return sum(map(len, self.itervalues()))
    else:
      return 0

  def __str__(self):
    """Returns buffer count or buffer name if there's just one stored."""
    n = len(self.keys())
    if n == 1:
      return self.keys()[0]
    elif n > 1:
      return '%s logs' %n
    else:
      return ''

  def items(self):
    """Returns a list of items sorted by line count."""
    items = dict.items(self)
    items.sort(key=lambda i: len(i[1]))
    return items

  def items_count(self):
    """Returns a list of items sorted by match count."""
    items = dict.items(self)
    items.sort(key=lambda i: i[1].matches_count)
    return items

  def strip_separator(self):
    for L in self.itervalues():
      L.strip_separator()

  def get_last_lines(self, n):
    total_lines = len(self)
    #debug('total: %s n: %s' %(total_lines, n))
    if n >= total_lines:
      # nothing to do
      return
    for k, v in reversed(self.items()):
      l = len(v)
      if n > 0:
        if l > n:
          del v[:l-n]
          v.stripped_lines = l-n
        n -= l
      else:
        del v[:]
        v.stripped_lines = l

class linesList(list):
  """List of matched lines. Lines that aren't matches (context lines, separators) can be added
  too, so matches are tracked with an independent counter."""
  _sep = '...'
  def __init__(self, *args):
    list.__init__(self, *args)
    self.matches_count = 0
    self.stripped_lines = 0

  def append(self, item):
    """Append lines, can be a string or a list with strings."""
    if isinstance(item, str):
      list.append(self, item)
    else:
      self.extend(item)

  def append_separator(self):
    """Adds a separator to the list, making sure it doesn't add two in a row."""
    s = self._sep
    if not self or self[-1] != s:
      self.append(s)

  def onlyUniq(self):
    s = set(self)
    del self[:]
    self.extend(s)

  def count_match(self, item=None):
    if item is None or isinstance(item, str):
      self.matches_count += 1
    else:
      self.matches_count += len(item)

  def strip_separator(self):
    """Removes separators if they are first and/or last in the list."""
    if self:
      s = self._sep
      if self[0] == s:
        del self[0]
      # the list may be empty now if the separator was its only item
      if self and self[-1] == s:
        del self[-1]

### Misc functions ###
now = time.time
def get_size(f):
  try:
    return os.stat(f).st_size
  except OSError:
    return 0

sizeDict = {0:'b', 1:'KiB', 2:'MiB', 3:'GiB', 4:'TiB'}
def human_readable_size(size):
  power = 0
  while size >= 1024:
    power += 1
    size /= 1024.0
  return '%.2f %s' %(size, sizeDict.get(power, ''))

def color_nick(nick):
  """Returns coloured nick, with coloured mode if any."""
  if not nick: return ''
  wcolor = weechat.color
  config_string = lambda s : weechat.config_string(weechat.config_get(s))
  config_int = lambda s : weechat.config_integer(weechat.config_get(s))
  # prefix and suffix
  prefix = config_string('irc.look.nick_prefix')
  suffix = config_string('irc.look.nick_suffix')
  prefix_c = suffix_c = wcolor(config_string('weechat.color.chat_delimiters'))
  if nick[0] == prefix:
    nick = nick[1:]
  else:
    prefix = prefix_c = ''
  if nick[-1] == suffix:
    nick = nick[:-1]
    # suffix_c already holds the delimiter color and is joined right before the suffix
  else:
    suffix = suffix_c = ''
  # nick mode
  modes = '@!+%'
  if nick[0] in modes:
    mode, nick = nick[0], nick[1:]
    mode_color = wcolor(config_string('weechat.color.nicklist_prefix%d' \
      %(modes.find(mode) + 1)))
  else:
    mode = mode_color = ''
  # nick color
  nick_color = ''
  if nick:
    nick_color = weechat.info_get('irc_nick_color', nick)
    if not nick_color:
      # probably we're in WeeChat 0.3.0
      #debug('no irc_nick_color')
      color_nicks_number = config_int('weechat.look.color_nicks_number')
      idx = (sum(map(ord, nick))%color_nicks_number) + 1
      nick_color = wcolor(config_string('weechat.color.chat_nick_color%02d' %idx))
  return ''.join((prefix_c, prefix, mode_color, mode, nick_color, nick, suffix_c, suffix))

### Config and value validation ###
boolDict = {'on':True, 'off':False}
def get_config_boolean(config):
  value = weechat.config_get_plugin(config)
  try:
    return boolDict[value]
  except KeyError:
    default = settings[config]
    error("Error while fetching config '%s'. Using default value '%s'." %(config, default))
    error("'%s' is invalid, allowed: 'on', 'off'" %value)
    return boolDict[default]

def get_config_int(config, allow_empty_string=False):
  value = weechat.config_get_plugin(config)
  try:
    return int(value)
  except ValueError:
    if value == '' and allow_empty_string:
      return value
    default = settings[config]
    error("Error while fetching config '%s'. Using default value '%s'." %(config, default))
    error("'%s' is not a number." %value)
    return int(default)

def get_config_log_filter():
  filter = weechat.config_get_plugin('log_filter')
  if filter:
    return filter.split(',')
  else:
    return []

def get_home():
  home = weechat.config_string(weechat.config_get('logger.file.path'))
  home = home.replace('%h', weechat.info_get('weechat_dir', ''))
  home = path.abspath(path.expanduser(home))
  return home

def strip_home(s, dir=''):
  """Strips home dir from the beginning of the log path, this makes them shorter."""
  if not dir:
    global home_dir
    dir = home_dir
  l = len(dir)
  if s[:l] == dir:
    return s[l:]
  return s

### Messages ###
script_nick = SCRIPT_NAME
def error(s, buffer=''):
  """Error msg"""
  prnt(buffer, '%s%s %s' %(weechat.prefix('error'), script_nick, s))
  if weechat.config_get_plugin('debug'):
    import traceback
    if sys.exc_info()[0]:
      trace = traceback.format_exc()
      prnt('', trace)

def say(s, buffer=''):
  """normal msg"""
  prnt_date_tags(buffer, 0, 'no_highlight', '%s\t%s' %(script_nick, s))

### Log files and buffers ###
cache_dir = {} # note: don't remove, needed for completion if the script was loaded recently
def dir_list(dir, filter_list=(), filter_excludes=True, include_dir=False):
  """Returns a list of files in 'dir' and its subdirs."""
  global cache_dir
  from os import walk
  from fnmatch import fnmatch
  #debug('dir_list: listing in %s' %dir)
  key = (dir, include_dir)
  try:
    return cache_dir[key]
  except KeyError:
    pass
  
  filter_list = filter_list or get_config_log_filter()
  dir_len = len(dir)
  if filter_list:
    def filter(file):
      file = file[dir_len:] # pattern shouldn't match home dir
      for pattern in filter_list:
        if fnmatch(file, pattern):
          return filter_excludes
      return not filter_excludes
  else:
    filter = lambda f : not filter_excludes

  file_list = []
  extend = file_list.extend
  join = path.join
  def walk_path():
    for basedir, subdirs, files in walk(dir):
      #if include_dir:
      #  subdirs = map(lambda s : join(s, ''), subdirs)
      #  files.extend(subdirs)
      files_path = map(lambda f : join(basedir, f), files)
      files_path = [ file for file in files_path if not filter(file) ]
      extend(files_path)

  walk_path()
  cache_dir[key] = file_list
  #debug('dir_list: got %s' %str(file_list))
  return file_list

def get_file_by_pattern(pattern, all=False):
  """Returns a list with the log whose path matches 'pattern' (the last one in sort order if
  several match); if 'all' is True, returns all logs that match."""
  if not pattern: return []
  #debug('get_file_by_filename: searching for %s.' %pattern)
  # do envvar expansion and check file
  file = path.expanduser(pattern)
  file = path.expandvars(file)
  if path.isfile(file):
    return [file]
  # lets see if there's a matching log
  global home_dir
  file = path.join(home_dir, pattern)
  if path.isfile(file):
    return [file]
  else:
    from fnmatch import fnmatch
    file = []
    file_list = dir_list(home_dir)
    n = len(home_dir)
    for log in file_list:
      basename = log[n:]
      if fnmatch(basename, pattern):
        file.append(log)
    #debug('get_file_by_filename: got %s.' %file)
    if not all and file:
      file.sort()
      return [ file[-1] ]
    return file

def get_file_by_buffer(buffer):
  """Given buffer pointer, finds log's path or returns None."""
  #debug('get_file_by_buffer: searching for %s' %buffer)
  infolist = weechat.infolist_get('logger_buffer', '', '')
  if not infolist: return
  try:
    while weechat.infolist_next(infolist):
      pointer = weechat.infolist_pointer(infolist, 'buffer')
      if pointer == buffer:
        file = weechat.infolist_string(infolist, 'log_filename')
        if weechat.infolist_integer(infolist, 'log_enabled'):
          #debug('get_file_by_buffer: got %s' %file)
          return file
        #else:
        #  debug('get_file_by_buffer: got %s but log not enabled' %file)
  finally:
    #debug('infolist gets freed')
    weechat.infolist_free(infolist)

def get_file_by_name(buffer_name):
  """Given a buffer name, returns its log path or None. buffer_name should be in 'server.#channel'
  or '#channel' format."""
  #debug('get_file_by_name: searching for %s' %buffer_name)
  # common mask options
  config_masks = ('logger.mask.irc', 'logger.file.mask')
  # since there's no buffer pointer, we try to replace some local vars in mask, like $channel and
  # $server, then replace the local vars left with '*', and use it as a mask to get the path with
  # get_file_by_pattern
  for config in config_masks:
    mask = weechat.config_string(weechat.config_get(config))
    #debug('get_file_by_name: mask: %s' %mask)
    if '$name' in mask:
      mask = mask.replace('$name', buffer_name)
    elif '$channel' in mask or '$server' in mask:
      if '.' in buffer_name and \
          '#' not in buffer_name[:buffer_name.find('.')]: # the dot isn't part of the channel name
        #  ^ I'm assuming channel starts with #, I'm lazy.
        server, channel = buffer_name.split('.', 1)
      else:
        server, channel = '*', buffer_name
      if '$channel' in mask:
        mask = mask.replace('$channel', channel)
      if '$server' in mask:
        mask = mask.replace('$server', server)
    # change the unreplaced vars by '*'
    from string import letters
    if '%' in mask:
      # vars for time formatting
      mask = mask.replace('%', '$')
    if '$' in mask:
      masks = mask.split('$')
      masks = map(lambda s: s.lstrip(letters), masks)
      mask = '*'.join(masks)
      if mask[0] != '*':
        mask = '*' + mask
    #debug('get_file_by_name: using mask %s' %mask)
    file = get_file_by_pattern(mask)
    #debug('get_file_by_name: got file %s' %file)
    if file:
      return file
  return None
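
# Illustrative sketch (not part of the original script): how a logger mask is
# turned into a glob pattern, following the steps in get_file_by_name() above.
# The name '_example_mask_to_glob' and the sample masks are assumptions made
# for this example; string.ascii_letters stands in for string.letters so the
# snippet also runs outside Python 2.
def _example_mask_to_glob(mask, server, channel):
  from string import ascii_letters
  mask = mask.replace('$channel', channel).replace('$server', server)
  # time formatting vars (%Y, %m, ...) are treated like unreplaced local vars
  mask = mask.replace('%', '$')
  if '$' in mask:
    # drop each var name (the letters right after '$') and join with '*'
    parts = [ s.lstrip(ascii_letters) for s in mask.split('$') ]
    mask = '*'.join(parts)
    if mask[0] != '*':
      mask = '*' + mask
  return mask

# e.g. _example_mask_to_glob('$plugin.$server.$channel.weechatlog', 'freenode', '#weechat')
# gives '*.freenode.#weechat.weechatlog', which can then be handed to fnmatch.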

def get_buffer_by_name(buffer_name):
  """Given a buffer name returns its buffer pointer or None."""
  #debug('get_buffer_by_name: searching for %s' %buffer_name)
  pointer = weechat.buffer_search('', buffer_name)
  if not pointer:
    infolist = weechat.infolist_get('buffer', '', '')
    try:
      while weechat.infolist_next(infolist):
        short_name = weechat.infolist_string(infolist, 'short_name')
        name = weechat.infolist_string(infolist, 'name')
        if buffer_name in (short_name, name):
          #debug('get_buffer_by_name: found %s' %name)
          pointer = weechat.buffer_search('', name)
          return pointer
    finally:
      weechat.infolist_free(infolist)
  #debug('get_buffer_by_name: got %s' %pointer)
  return pointer

def get_all_buffers():
  """Returns list with pointers of all open buffers."""
  buffers = []
  infolist = weechat.infolist_get('buffer', '', '')
  while weechat.infolist_next(infolist):
    buffers.append(weechat.infolist_pointer(infolist, 'pointer'))
  weechat.infolist_free(infolist)
  grep_buffer = weechat.buffer_search('python', SCRIPT_NAME)
  if grep_buffer and grep_buffer in buffers:
    # remove it from list
    del buffers[buffers.index(grep_buffer)]
  return buffers

### Grep ###
def make_regexp(pattern, matchcase=False):
  """Returns a compiled regexp."""
  if pattern in ('.', '.*', '.?', '.+'):
    # because I don't need to use a regexp if we're going to match all lines
    return None
  # matching takes a lot more time if pattern starts or ends with .* and it isn't needed.
  if pattern[:2] == '.*':
    pattern = pattern[2:]
  if pattern[-2:] == '.*':
    pattern = pattern[:-2]
  try:
    if not matchcase:
      regexp = re.compile(pattern, re.IGNORECASE)
    else:
      regexp = re.compile(pattern)
  except Exception, e:
    raise Exception, 'Bad pattern, %s' %e
  return regexp

def check_string(s, regexp, hilight='', exact=False):
  """Checks 's' with a regexp and returns it if it is a match."""
  if not regexp:
    return s

  elif exact:
    matchlist = regexp.findall(s)
    if matchlist:
      if isinstance(matchlist[0], tuple):
        # join tuples (when there's more than one match group in regexp)
        return [ ' '.join(t) for t in matchlist ]
      return matchlist

  elif hilight:
    matchlist = regexp.findall(s)
    if matchlist:
      if isinstance(matchlist[0], tuple):
        # flatten matchlist
        matchlist = [ item for L in matchlist for item in L if item ]
      matchlist = list(set(matchlist)) # remove duplicates if any
      # apply hilight
      color_hilight, color_reset = hilight.split(',', 1)
      for m in matchlist:
        s = s.replace(m, '%s%s%s' % (color_hilight, m, color_reset))
      return s

  # no need for findall() here
  elif regexp.search(s):
    return s

def grep_file(file, head, tail, after_context, before_context, count, regexp, hilight, exact, invert):
  """Return a list of lines that match 'regexp' in 'file', if no regexp returns all lines."""
  if count:
    tail = head = after_context = before_context = False
    hilight = ''
  elif exact:
    before_context = after_context = False
    hilight = ''
  elif invert:
    hilight = ''
  #debug(' '.join(map(str, (file, head, tail, after_context, before_context))))

  lines = linesList()
  # define these locally as it makes the loop run slightly faster
  append = lines.append
  count_match = lines.count_match
  separator = lines.append_separator
  if invert:
    def check(s):
      if check_string(s, regexp, hilight, exact):
        return None
      else:
        return s
  else:
    check = lambda s: check_string(s, regexp, hilight, exact)
  
  try:
    file_object = open(file, 'r')
  except IOError:
    # file doesn't exist
    return lines
  if tail or before_context:
    # these options need access to arbitrary lines, so the whole file is read at once. This is
    # slower and uses a good deal of memory if the log is big, so we do it *only* for them.
    file_lines = file_object.readlines()

    if tail:
      # instead of searching the whole file and picking the last few lines afterwards, we
      # reverse the log, search until the count is reached and reverse the result again,
      # which is a lot faster
      file_lines.reverse()
      # swap the context options, so they still mean before/after in the original order
      before_context, after_context = after_context, before_context

    if before_context:
      before_context_range = range(1, before_context + 1)
      before_context_range.reverse()

    limit = tail or head

    line_idx = 0
    while line_idx < len(file_lines):
      line = file_lines[line_idx]
      line = check(line)
      if line:
        if before_context:
          separator()
          trimmed = False
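          for id in before_context_range:
            if id > line_idx:
              # a negative index would wrap around and pick lines from the end of the file
              continue
            try: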
              context_line = file_lines[line_idx - id]
              if check(context_line):
                # match in before context, that means we appended these same lines in a
                # previous match, so we delete them merging both paragraphs
                if not trimmed:
                  del lines[id - before_context - 1:]
                  trimmed = True
              else:
                append(context_line)
            except IndexError:
              pass
        append(line)
        count_match(line)
        if after_context:
          id, offset = 0, 0
          while id < after_context + offset:
            id += 1
            try:
              context_line = file_lines[line_idx + id]
              _context_line = check(context_line)
              if _context_line:
                offset = id
                context_line = _context_line # so match is hilighted with --hilight
                count_match()
              append(context_line)
            except IndexError:
              pass
          separator()
          line_idx += id
        if limit and lines.matches_count >= limit:
          break
      line_idx += 1

    if tail:
      lines.reverse()
  else:
    # do a normal grep
    limit = head

    for line in file_object:
      line = check(line)
      if line:
        count or append(line)
        count_match(line)
        if after_context:
          id, offset = 0, 0
          while id < after_context + offset:
            id += 1
            try:
              context_line = file_object.next()
              _context_line = check(context_line)
              if _context_line:
                offset = id
                context_line = _context_line
                count_match()
              count or append(context_line)
            except StopIteration:
              pass
          separator()
        if limit and lines.matches_count >= limit:
          break

  file_object.close()
  return lines
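
# Illustrative sketch (not part of the original script) of the --tail strategy
# used in grep_file() above: walk the lines backwards, stop as soon as 'limit'
# matches are found, then reverse the result so it reads in file order. The
# name '_example_tail_matches' and its arguments are made up for this example,
# and context lines are left out to keep it short.
def _example_tail_matches(file_lines, regexp, limit):
  matches = []
  for line in reversed(file_lines):
    if regexp.search(line):
      matches.append(line)
      if len(matches) >= limit:
        # enough matches, the rest of the log is never examined
        break
  matches.reverse()
  return matches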

def grep_buffer(buffer, head, tail, after_context, before_context, count, regexp, hilight, exact,
    invert):
  """Return a list of lines that match 'regexp' in 'buffer', if no regexp returns all lines."""
  lines = linesList()
  if count:
    tail = head = after_context = before_context = False
    hilight = ''
  elif exact:
    before_context = after_context = False
  #debug(' '.join(map(str, (tail, head, after_context, before_context, count, exact, hilight))))

  # Using /grep in grep's buffer can lead to some funny effects
  # We should take measures if that's the case
  def make_get_line_funcion():
    """Returns a function to get lines from the infolist, depending on whether the buffer is
    grep's or not."""
    string_remove_color = weechat.string_remove_color
    infolist_string = weechat.infolist_string
    grep_buffer = weechat.buffer_search('python', SCRIPT_NAME)
    if grep_buffer and buffer == grep_buffer:
      def function(infolist):
        prefix = infolist_string(infolist, 'prefix')
        message = infolist_string(infolist, 'message')
        if prefix: # only our messages have prefix, ignore it
          return None
        return message
    else:
      infolist_time = weechat.infolist_time
      def function(infolist):
        prefix = string_remove_color(infolist_string(infolist, 'prefix'), '')
        message = string_remove_color(infolist_string(infolist, 'message'), '')
        date = infolist_time(infolist, 'date')
        # since WeeChat 2.2, infolist_time returns a long integer
        # instead of a string
        if not isinstance(date, str):
          date = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(int(date)))
        return '%s\t%s\t%s' %(date, prefix, message)
    return function
  get_line = make_get_line_funcion()

  infolist = weechat.infolist_get('buffer_lines', buffer, '')
  if tail:
    # like with grep_file() if we need the last few matching lines, we move the cursor to
    # the end and search backwards
    infolist_next = weechat.infolist_prev
    infolist_prev = weechat.infolist_next
  else:
    infolist_next = weechat.infolist_next
    infolist_prev = weechat.infolist_prev
  limit = head or tail

  # define these locally as it makes the loop run slightly faster
  append = lines.append
  count_match = lines.count_match
  separator = lines.append_separator
  if invert:
    def check(s):
      if check_string(s, regexp, hilight, exact):
        return None
      else:
        return s
  else:
    check = lambda s: check_string(s, regexp, hilight, exact)

  if before_context:
    before_context_range = range(1, before_context + 1)
    before_context_range.reverse()

  while infolist_next(infolist):
    line = get_line(infolist)
    if line is None: continue
    line = check(line)
    if line:
      if before_context:
        separator()
        trimmed = False
        for id in before_context_range:
          if not infolist_prev(infolist):
            trimmed = True
        for id in before_context_range:
          context_line = get_line(infolist)
          if check(context_line):
            if not trimmed:
              del lines[id - before_context - 1:]
              trimmed = True
          else:
            append(context_line)
          infolist_next(infolist)
      count or append(line)
      count_match(line)
      if after_context:
        id, offset = 0, 0
        while id < after_context + offset:
          id += 1
          if infolist_next(infolist):
            context_line = get_line(infolist)
            _context_line = check(context_line)
            if _context_line:
              context_line = _context_line
              offset = id
              count_match()
            append(context_line)
          else:
            # in the main loop infolist_next will start again and cause an infinite loop,
            # this will avoid it
            infolist_next = lambda x: 0
        separator()
      if limit and lines.matches_count >= limit:
        break
  weechat.infolist_free(infolist)

  if tail:
    lines.reverse()
  return lines

### this is our main grep function
hook_file_grep = None
def show_matching_lines():
  """
  Greps buffers in search_in_buffers or files in search_in_files and updates grep buffer with the
  result.
  """
  global pattern, matchcase, number, count, exact, hilight, invert
  global tail, head, after_context, before_context
  global search_in_files, search_in_buffers, matched_lines, home_dir
  global time_start
  matched_lines = linesDict()
  #debug('buffers:%s \nlogs:%s' %(search_in_buffers, search_in_files))
  time_start = now()

  # buffers
  if search_in_buffers:
    regexp = make_regexp(pattern, matchcase)
    for buffer in search_in_buffers:
      buffer_name = weechat.buffer_get_string(buffer, 'name')
      matched_lines[buffer_name] = grep_buffer(buffer, head, tail, after_context,
          before_context, count, regexp, hilight, exact, invert)

  # logs
  if search_in_files:
    size_limit = get_config_int('size_limit', allow_empty_string=True)
    background = False
    if size_limit != '':
      # an empty 'size_limit' disables background grepping
      size = sum(map(get_size, search_in_files))
      if size > size_limit * 1024:
        background = True

    regexp = make_regexp(pattern, matchcase)

    global grep_options, log_pairs
    grep_options = (head, tail, after_context, before_context,
            count, regexp, hilight, exact, invert)

    log_pairs = [(strip_home(log), log) for log in search_in_files]

    if not background:
      # run grep normally
      for log_name, log in log_pairs:
        matched_lines[log_name] = grep_file(log, *grep_options)
      buffer_update()
    else:
      global hook_file_grep, grep_stdout, grep_stderr, pattern_tmpl
      grep_stdout = grep_stderr = ''
      hook_file_grep = weechat.hook_process(
        'func:grep_process',
        get_config_int('timeout_secs') * 1000,
        'grep_process_cb',
        ''
      )
      if hook_file_grep:
        buffer_create("Searching for '%s' in %s worth of data..." % (
          pattern_tmpl,
          human_readable_size(size)
        ))
  else:
    buffer_update()


def grep_process(*args):
  result = {}
  try:
    global grep_options, log_pairs
    for log_name, log in log_pairs:
      result[log_name] = grep_file(log, *grep_options)
  except Exception, e:
    result = e

  return pickle.dumps(result)

grep_stdout = grep_stderr = ''

def grep_process_cb(data, command, return_code, out, err):
  global grep_stdout, grep_stderr, matched_lines, hook_file_grep

  grep_stdout += out
  grep_stderr += err

  def set_buffer_error(message):
    error(message)
    grep_buffer = buffer_create()
    title = weechat.buffer_get_string(grep_buffer, 'title')
    title = title + ' %serror' % color_title
    weechat.buffer_set(grep_buffer, 'title', title)

  if return_code == weechat.WEECHAT_HOOK_PROCESS_ERROR:
    set_buffer_error("Background grep timed out")
    hook_file_grep = None
    return WEECHAT_RC_OK

  elif return_code >= 0:
    hook_file_grep = None
    if grep_stderr:
      set_buffer_error(grep_stderr)
      return WEECHAT_RC_OK

    try:
      data = pickle.loads(grep_stdout)
      if isinstance(data, Exception):
        raise data
      matched_lines.update(data)
    except Exception, e:
      set_buffer_error(repr(e))
      return WEECHAT_RC_OK
    else:
      buffer_update()

  return WEECHAT_RC_OK
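
# Illustrative sketch (not part of the original script) of the hand-off between
# grep_process() and grep_process_cb() above: the worker pickles either its
# results or the exception it hit, and the receiver re-raises if it unpickles
# an exception. The helper names are made up for this example, and
# sys.exc_info() is used so the snippet doesn't depend on a particular
# 'except' syntax.
def _example_pack_result(func):
  import sys, pickle
  try:
    result = func()
  except Exception:
    result = sys.exc_info()[1]
  return pickle.dumps(result)

def _example_unpack_result(data):
  import pickle
  result = pickle.loads(data)
  if isinstance(result, Exception):
    raise result
  return result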

def get_grep_file_status():
  global search_in_files, matched_lines, time_start
  elapsed = now() - time_start
  if len(search_in_files) == 1:
    log = '%s (%s)' %(strip_home(search_in_files[0]),
        human_readable_size(get_size(search_in_files[0])))
  else:
    size = sum(map(get_size, search_in_files))
    log = '%s log files (%s)' %(len(search_in_files), human_readable_size(size))
  return 'Searching in %s, running for %.4f seconds. Interrupt it with "/grep stop" or "stop"' \
    ' in grep buffer.' %(log, elapsed)

### Grep buffer ###
def buffer_update():
  """Updates our buffer with new lines."""
  global pattern_tmpl, matched_lines, pattern, count, hilight, invert, exact
  time_grep = now()

  buffer = buffer_create()
  if get_config_boolean('clear_buffer'):
    weechat.buffer_clear(buffer)
  matched_lines.strip_separator() # remove first and last separators of each list
  len_total_lines = len(matched_lines)
  max_lines = get_config_int('max_lines')
  if not count and len_total_lines > max_lines:
    weechat.buffer_clear(buffer)

  def _make_summary(log, lines, note):
    return '%s matches "%s%s%s"%s in %s%s%s%s' \
        %(lines.matches_count, color_summary, pattern_tmpl, color_info,
         invert and ' (inverted)' or '',
         color_summary, log, color_reset, note)

  if count:
    make_summary = lambda log, lines : _make_summary(log, lines, ' (not shown)')
  else:
    def make_summary(log, lines):
      if lines.stripped_lines:
        if lines:
          note = ' (last %s lines shown)' %len(lines)
        else:
          note = ' (not shown)'
      else:
        note = ''
      return _make_summary(log, lines, note)

  global weechat_format
  if hilight:
    # we don't want colors if there's match highlighting
    format_line = lambda s : '%s %s %s' %split_line(s)
  else:
    def format_line(s):
      global nick_dict, weechat_format
      date, nick, msg = split_line(s)
      if weechat_format:
        try:
          nick = nick_dict[nick]
        except KeyError:
          # cache nick
          nick_c = color_nick(nick)
          nick_dict[nick] = nick_c
          nick = nick_c
        return '%s%s %s%s %s' %(color_date, date, nick, color_reset, msg)
      else:
        #no formatting
        return msg

  prnt(buffer, '\n')
  print_line('Search for "%s%s%s"%s in %s%s%s.' %(color_summary, pattern_tmpl, color_info,
    invert and ' (inverted)' or '', color_summary, matched_lines, color_reset),
      buffer)
  # print last <max_lines> lines
  if matched_lines.get_matches_count():
    if count:
      # with count we sort by matched lines instead of just lines.
      matched_lines_items = matched_lines.items_count()
    else:
      matched_lines_items = matched_lines.items()

    matched_lines.get_last_lines(max_lines)
    for log, lines in matched_lines_items:
      if lines.matches_count:
        # matched lines
        if not count:
          # print lines
          weechat_format = True
          if exact:
            lines.onlyUniq()
          for line in lines:
            #debug(repr(line))
            if line == linesList._sep:
              # separator
              prnt(buffer, context_sep)
            else:
              if '\x00' in line:
                # log was corrupted
                error("Found garbage in log '%s', maybe it's corrupted" %log)
                line = line.replace('\x00', '')
              prnt_date_tags(buffer, 0, 'no_highlight', format_line(line))

        # summary
        if count or get_config_boolean('show_summary'):
          summary = make_summary(log, lines)
          print_line(summary, buffer)

      # separator
      if not count and lines:
        prnt(buffer, '\n')
  else:
    print_line('No matches found.', buffer)

  # set title
  global time_start
  time_end = now()
  # total time
  time_total = time_end - time_start
  # percent of the total time used for grepping
  time_grep_pct = (time_grep - time_start)/time_total*100
  #debug('time: %.4f seconds (%.2f%%)' %(time_total, time_grep_pct))
  if not count and len_total_lines > max_lines:
    note = ' (last %s lines shown)' %len(matched_lines)
  else:
    note = ''
  title = "'q': close buffer | Search in %s%s%s %s matches%s | pattern \"%s%s%s\"%s %s | %.4f seconds (%.2f%%)" \
      %(color_title, matched_lines, color_reset, matched_lines.get_matches_count(), note,
       color_title, pattern_tmpl, color_reset, invert and ' (inverted)' or '', format_options(),
       time_total, time_grep_pct)
  weechat.buffer_set(buffer, 'title', title)

  if get_config_boolean('go_to_buffer'):
    weechat.buffer_set(buffer, 'display', '1')

  # free matched_lines so it can be removed from memory
  del matched_lines
  
def split_line(s):
  """Splits log's line 's' in 3 parts, date, nick and msg."""
  global weechat_format
  if weechat_format and s.count('\t') >= 2:
    date, nick, msg = s.split('\t', 2) # date, nick, message
  else:
    # looks like log isn't in weechat's format
    weechat_format = False # incoming lines won't be formatted
    date, nick, msg = '', '', s
  # remove tabs
  if '\t' in msg:
    msg = msg.replace('\t', '  ')
  return date, nick, msg

def print_line(s, buffer=None, display=False):
  """Prints 's' in script's buffer as 'script_nick'. For displaying search summaries."""
  if buffer is None:
    buffer = buffer_create()
  say('%s%s' %(color_info, s), buffer)
  if display and get_config_boolean('go_to_buffer'):
    weechat.buffer_set(buffer, 'display', '1')

def format_options():
  global matchcase, number, count, exact, hilight, invert
  global tail, head, after_context, before_context
  options = []
  append = options.append
  insert = options.insert
  chars = 'cHmov'
  for i, flag in enumerate((count, hilight, matchcase, exact, invert)):
    if flag:
      append(chars[i])

  if head or tail:
    n = get_config_int('default_tail_head')
    if head:
      append('h')
      if head != n:
        insert(-1, ' -')
        append('n')
        append(head)
    elif tail:
      append('t')
      if tail != n:
        insert(-1, ' -')
        append('n')
        append(tail)

  if before_context and after_context and (before_context == after_context):
    append(' -C')
    append(before_context)
  else:
    if before_context:
      append(' -B')
      append(before_context)
    if after_context:
      append(' -A')
      append(after_context)

  s = ''.join(map(str, options)).strip()
  if s and s[0] != '-':
    s = '-' + s
  return s

def buffer_create(title=None):
  """Returns our buffer pointer, creates and cleans the buffer if needed."""
  buffer = weechat.buffer_search('python', SCRIPT_NAME)
  if not buffer:
    buffer = weechat.buffer_new(SCRIPT_NAME, 'buffer_input', '', '', '')
    weechat.buffer_set(buffer, 'time_for_each_line', '0')
    weechat.buffer_set(buffer, 'nicklist', '0')
    weechat.buffer_set(buffer, 'title', title or 'grep output buffer')
    weechat.buffer_set(buffer, 'localvar_set_no_log', '1')
  elif title:
    weechat.buffer_set(buffer, 'title', title)
  return buffer

def buffer_input(data, buffer, input_data):
  """Repeats last search with 'input_data' as regexp."""
  try:
    cmd_grep_stop(buffer, input_data)
  except Exception:
    return WEECHAT_RC_OK
  if input_data in ('q', 'Q'):
    weechat.buffer_close(buffer)
    return weechat.WEECHAT_RC_OK

  global search_in_buffers, search_in_files
  global pattern
  try:
    if pattern and (search_in_files or search_in_buffers):
      # check if the buffer pointers are still valid
      for pointer in search_in_buffers[:]:
        # iterate over a copy, since stale pointers are removed from the list
        infolist = weechat.infolist_get('buffer', pointer, '')
        if infolist:
          weechat.infolist_free(infolist)
        else:
          search_in_buffers.remove(pointer)
      try:
        cmd_grep_parsing(input_data)
      except Exception, e:
        error('Argument error, %s' %e, buffer=buffer)
        return WEECHAT_RC_OK
      try:
        show_matching_lines()
      except Exception, e:
        error(e)
  except NameError:
    error("There isn't any previous search to repeat.", buffer=buffer)
  return WEECHAT_RC_OK

### Commands ###
def cmd_init():
  """Resets global vars."""
  global home_dir, cache_dir, nick_dict
  global pattern_tmpl, pattern, matchcase, number, count, exact, hilight, invert
  global tail, head, after_context, before_context
  hilight = ''
  head = tail = after_context = before_context = invert = False
  matchcase = count = exact = False
  pattern_tmpl = pattern = number = None
  home_dir = get_home()
  cache_dir = {} # avoids walking the dir tree more than once per command
  nick_dict = {} # nick color cache, so colors aren't recalculated every time

def cmd_grep_parsing(args):
  """Parses args for /grep and grep input buffer."""
  global pattern_tmpl, pattern, matchcase, number, count, exact, hilight, invert
  global tail, head, after_context, before_context
  global log_name, buffer_name, only_buffers, all
  opts, args = getopt.gnu_getopt(args.split(), 'cmHeahtivn:bA:B:C:o', ['count', 'matchcase', 'hilight',
    'exact', 'all', 'head', 'tail', 'number=', 'buffer', 'after-context=', 'before-context=',
    'context=', 'invert', 'only-match'])
  #debug(opts, 'opts: '); debug(args, 'args: ')
  if len(args) >= 2:
    if args[0] == 'log':
      del args[0]
      log_name = args.pop(0)
    elif args[0] == 'buffer':
      del args[0]
      buffer_name = args.pop(0)

  def tmplReplacer(match):
    """Replaces a template placeholder with its regexp."""
    t = match.group(0) # whole placeholder, e.g. '%{url weechat}'
    s = match.group(1)
    tmpl_key, _, tmpl_args = s.partition(' ')
    try:
      template = templates[tmpl_key]
      if callable(template):
        r = template(tmpl_args)
        if not r:
          error("Template %s returned empty string "\
             "(WeeChat doesn't have enough data)." %t)
        return r
      else:
        return template
    except Exception:
      # unknown template or failing template function: leave the text as is
      return t

  args = ' '.join(args) # join the pattern words back to keep spaces
  if args:
    pattern_tmpl = args
    pattern = _tmplRe.sub(tmplReplacer, args)
    debug('Using regexp: %s', pattern)
  if not pattern:
    raise Exception('No pattern given for grepping the logs.')

  def positive_number(opt, val):
    try:
      number = int(val)
      if number < 0:
        raise ValueError
      return number
    except ValueError:
      if len(opt) == 1:
        opt = '-' + opt
      else:
        opt = '--' + opt
      # zero is accepted (it disables head/tail), so the value is non-negative
      raise Exception("argument for %s must be a non-negative integer." %opt)

  for opt, val in opts:
    opt = opt.strip('-')
    if opt in ('c', 'count'):
      count = not count
    elif opt in ('m', 'matchcase'):
      matchcase = not matchcase
    elif opt in ('H', 'hilight'):
      # hilight must be always a string!
      if hilight:
        hilight = ''
      else:
        hilight = '%s,%s' %(color_hilight, color_reset)
      # we pass the colors in the variable itself because check_string() must not use
      # weechat's module when applying the colors (this is for grep in a hooked process)
    elif opt in ('e', 'exact', 'o', 'only-match'):
      exact = not exact
      invert = False
    elif opt in ('a', 'all'):
      all = not all
    elif opt in ('h', 'head'):
      head = not head
      tail = False
    elif opt in ('t', 'tail'):
      tail = not tail
      head = False
    elif opt in ('b', 'buffer'):
      only_buffers = True
    elif opt in ('n', 'number'):
      number = positive_number(opt, val)
    elif opt in ('C', 'context'):
      n = positive_number(opt, val)
      after_context = n
      before_context = n
    elif opt in ('A', 'after-context'):
      after_context = positive_number(opt, val)
    elif opt in ('B', 'before-context'):
      before_context = positive_number(opt, val)
    elif opt in ('i', 'v', 'invert'):
      invert = not invert
      exact = False
  # number check
  if number is not None:
    if number == 0:
      head = tail = False
      number = None
    elif head:
      head = number
    elif tail:
      tail = number
  else:
    n = get_config_int('default_tail_head')
    if head:
      head = n
    elif tail:
      tail = n

def cmd_grep_stop(buffer, args):
  global hook_file_grep, pattern, matched_lines
  if hook_file_grep:
    if args == 'stop':
      weechat.unhook(hook_file_grep)
      hook_file_grep = None

      s = 'Search for \'%s\' stopped.' % pattern
      say(s, buffer)
      grep_buffer = weechat.buffer_search('python', SCRIPT_NAME)
      if grep_buffer:
        weechat.buffer_set(grep_buffer, 'title', s)
      matched_lines = {}
    else:
      say(get_grep_file_status(), buffer)
    raise Exception

def cmd_grep(data, buffer, args):
  """Search in buffers and logs."""
  global pattern, matchcase, head, tail, number, count, exact, hilight
  try:
    cmd_grep_stop(buffer, args)
  except Exception:
    return WEECHAT_RC_OK

  if not args:
    weechat.command('', '/help %s' %SCRIPT_COMMAND)
    return WEECHAT_RC_OK

  cmd_init()
  global log_name, buffer_name, only_buffers, all
  log_name = buffer_name = ''
  only_buffers = all = False

  # parse
  try:
    cmd_grep_parsing(args)
  except Exception, e:
    error('Argument error, %s' %e)
    return WEECHAT_RC_OK

  # find logs
  log_file = search_buffer = None
  if log_name:
    log_file = get_file_by_pattern(log_name, all)
    if not log_file:
      error("Couldn't find any log for %s. Try /logs" %log_name)
      return WEECHAT_RC_OK
  elif all:
    search_buffer = get_all_buffers()
  elif buffer_name:
    search_buffer = get_buffer_by_name(buffer_name)
    if not search_buffer:
      # there's no buffer, try in the logs
      log_file = get_file_by_name(buffer_name)
      if not log_file:
        error("Logs or buffer for '%s' not found." %buffer_name)
        return WEECHAT_RC_OK
    else:
      search_buffer = [search_buffer]
  else:
    search_buffer = [buffer]

  # make the log list
  global search_in_files, search_in_buffers
  search_in_files = []
  search_in_buffers = []
  if log_file:
    search_in_files = log_file
  elif not only_buffers:
    #debug(search_buffer)
    for pointer in search_buffer:
      log = get_file_by_buffer(pointer)
      #debug('buffer %s log %s' %(pointer, log))
      if log:
        search_in_files.append(log)
      else:
        search_in_buffers.append(pointer)
  else:
    search_in_buffers = search_buffer

  # grepping
  try:
    show_matching_lines()
  except Exception, e:
    error(e)
  return WEECHAT_RC_OK

def cmd_logs(data, buffer, args):
  """List files in WeeChat's log dir."""
  cmd_init()
  global home_dir
  sort_by_size = False
  filter = []

  try:
    opts, args = getopt.gnu_getopt(args.split(), 's', ['size'])
    if args:
      filter = args
    for opt, var in opts:
      opt = opt.strip('-')
      if opt in ('size', 's'):
        sort_by_size = True
  except Exception, e:
    error('Argument error, %s' %e)
    return WEECHAT_RC_OK

  # if there's a filter, filter_excludes should be False
  file_list = dir_list(home_dir, filter, filter_excludes=not filter)
  if sort_by_size:
    file_list.sort(key=get_size)
  else:
    file_list.sort()

  file_sizes = map(lambda x: human_readable_size(get_size(x)), file_list)
  # calculate column length
  if file_list:
    L = file_list[:]
    L.sort(key=len)
    biggest = L[-1]
    column_len = len(biggest) + 3
  else:
    column_len = ''

  buffer = buffer_create()
  if get_config_boolean('clear_buffer'):
    weechat.buffer_clear(buffer)
  file_list = zip(file_list, file_sizes)
  msg = 'Found %s logs.' %len(file_list)

  print_line(msg, buffer, display=True)
  for file, size in file_list:
    separator = column_len and '.'*(column_len - len(file))
    prnt(buffer, '%s %s %s' %(strip_home(file), separator, size))
  if file_list:
    print_line(msg, buffer)
  return WEECHAT_RC_OK


### Completion ###
def completion_log_files(data, completion_item, buffer, completion):
  #debug('completion: %s' %', '.join((data, completion_item, buffer, completion)))
  global home_dir
  l = len(home_dir)
  completion_list_add = weechat.hook_completion_list_add
  WEECHAT_LIST_POS_END = weechat.WEECHAT_LIST_POS_END
  for log in dir_list(home_dir):
    completion_list_add(completion, log[l:], 0, WEECHAT_LIST_POS_END)
  return WEECHAT_RC_OK

def completion_grep_args(data, completion_item, buffer, completion):
  for arg in ('count', 'all', 'matchcase', 'hilight', 'exact', 'head', 'tail', 'number', 'buffer',
      'after-context', 'before-context', 'context', 'invert', 'only-match'):
    weechat.hook_completion_list_add(completion, '--' + arg, 0, weechat.WEECHAT_LIST_POS_SORT)
  for tmpl in templates:
    weechat.hook_completion_list_add(completion, '%{' + tmpl, 0, weechat.WEECHAT_LIST_POS_SORT)
  return WEECHAT_RC_OK


### Templates ###
# template placeholder
_tmplRe = re.compile(r'%\{(\w+.*?)(?:\}|$)')
# will match 999.999.999.999 but I don't care
ipAddress = r'\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}'
domain = r'[\w-]{2,}(?:\.[\w-]{2,})*\.[a-z]{2,}'
url = r'\w+://(?:%s|%s)(?::\d+)?(?:/[^\])>\s]*)?' % (domain, ipAddress)

def make_url_regexp(args):
  #debug('make url: %s', args)
  if args:
    words = r'(?:%s)' %'|'.join(map(re.escape, args.split()))
    return r'(?:\w+://|www\.)[^\s]*%s[^\s]*(?:/[^\])>\s]*)?' %words
  else:
    return url

def make_simple_regexp(pattern):
  s = ''
  for c in pattern:
    if c == '*':
      s += '.*'
    elif c == '?':
      s += '.'
    else:
      s += re.escape(c)
  return s

templates = {
    'ip': ipAddress,
    'url': make_url_regexp,
    'escape': lambda s: re.escape(s),
    'simple': make_simple_regexp,
    'domain': domain,
    }

### Main ###
def delete_bytecode():
  global script_path
  bytecode = path.join(script_path, SCRIPT_NAME + '.pyc')
  if path.isfile(bytecode):
    os.remove(bytecode)
  return WEECHAT_RC_OK

if __name__ == '__main__' and import_ok and \
    weechat.register(SCRIPT_NAME, SCRIPT_AUTHOR, SCRIPT_VERSION, SCRIPT_LICENSE, \
    SCRIPT_DESC, 'delete_bytecode', ''):
  home_dir = get_home()

  # so we can import ourselves
  global script_path
  script_path = path.dirname(__file__)
  sys.path.append(script_path)
  delete_bytecode()

  # check python version
  import sys
  global bytecode
  if sys.version_info > (2, 6):
    bytecode = 'B'
  else:
    bytecode = ''


  weechat.hook_command(SCRIPT_COMMAND, cmd_grep.__doc__,
      "[log <file> | buffer <name> | stop] [-a|--all] [-b|--buffer] [-c|--count] [-m|--matchcase] "
      "[-H|--hilight] [-o|--only-match] [-i|-v|--invert] [(-h|--head)|(-t|--tail) [-n|--number <n>]] "
      "[-A|--after-context <n>] [-B|--before-context <n>] [-C|--context <n> ] <expression>",
# help
"""
   log <file>: Search in one log that matches <file> in the logger path.
         Use '*' and '?' as wildcards.
 buffer <name>: Search in buffer <name>, if there's no buffer with <name> it will
         try to search for a log file.
      stop: Stops a currently running search.
    -a --all: Search in all open buffers.
         If used with 'log <file>' search in all logs that match <file>.
  -b --buffer: Search only in buffers, not in file logs.
   -c --count: Just count the number of matched lines instead of showing them.
 -m --matchcase: Don't do case insensitive search.
  -H --hilight: Colour exact matches in output buffer.
-o --only-match: Print only the matching part of the line (unique matches).
 -v -i --invert: Print lines that don't match the regular expression.
   -t --tail: Print the last 10 matching lines.
   -h --head: Print the first 10 matching lines.
-n --number <n>: Overrides default number of lines for --tail or --head.
-A --after-context <n>: Shows <n> lines of trailing context after matching lines.
-B --before-context <n>: Shows <n> lines of leading context before matching lines.
-C --context <n>: Same as using both --after-context and --before-context simultaneously.
 <expression>: Expression to search.

Grep buffer:
 Input line accepts most arguments of /grep, it'll repeat last search using the new
 arguments provided. You can't search in different logs from the buffer's input.
 Boolean arguments like --count, --tail, --head, --hilight, ... are toggleable.

Python regular expression syntax:
 See http://docs.python.org/lib/re-syntax.html

Grep Templates:
   %{url [text]}: Matches anything that looks like a URL, or a URL containing text.
       %{ip}: Matches anything that looks like an ip.
     %{domain}: Matches anything like a domain.
  %{escape text}: Escapes text in pattern.
 %{simple pattern}: Converts a pattern with '*' and '?' wildcards into a regexp.

Examples:
 Search for urls with the word 'weechat' said by 'nick'
  /grep nick\\t.*%{url weechat}
 Search for '*.*' string
  /grep %{escape *.*}
""",
      # completion template
      "buffer %(buffers_names) %(grep_arguments)|%*"
      "||log %(grep_log_files) %(grep_arguments)|%*"
      "||stop"
      "||%(grep_arguments)|%*",
      'cmd_grep' ,'')
  weechat.hook_command('logs', cmd_logs.__doc__, "[-s|--size] [<filter>]",
      "-s --size: Sort logs by size.\n"
      " <filter>: Only show logs that match <filter>. Use '*' and '?' as wildcards.", '--size', 'cmd_logs', '')

  weechat.hook_completion('grep_log_files', "list of log files",
      'completion_log_files', '')
  weechat.hook_completion('grep_arguments', "list of arguments",
      'completion_grep_args', '')

  # settings
  for opt, val in settings.iteritems():
    if not weechat.config_is_set_plugin(opt):
      weechat.config_set_plugin(opt, val)

  # colors
  color_date    = weechat.color('brown')
  color_info    = weechat.color('cyan')
  color_hilight   = weechat.color('lightred')
  color_reset    = weechat.color('reset')
  color_title    = weechat.color('yellow')
  color_summary   = weechat.color('lightcyan')
  color_delimiter  = weechat.color('chat_delimiters')
  color_script_nick = weechat.color('chat_nick')
  
  # pretty [grep]
  script_nick = '%s[%s%s%s]%s' %(color_delimiter, color_script_nick, SCRIPT_NAME, color_delimiter,
      color_reset)
  script_nick_nocolor = '[%s]' %SCRIPT_NAME
  # paragraph separator when using context options
  context_sep = '%s\t%s--' %(script_nick, color_info)

  # -------------------------------------------------------------------------
  # Debug

  if weechat.config_get_plugin('debug'):
    try:
      # custom debug module I use, allows me to inspect script's objects.
      import pybuffer
      debug = pybuffer.debugBuffer(globals(), '%s_debug' % SCRIPT_NAME)
    except:
      def debug(s, *args):
        if not isinstance(s, basestring):
          s = str(s)
        if args:
          s = s %args
        prnt('', '%s\t%s' %(script_nick, s))
  else:
    def debug(*args):
      pass

# vim:set shiftwidth=4 tabstop=4 softtabstop=4 expandtab textwidth=100: