Tuesday, February 7, 2012

awk: calculating frequency

Input:
иностранный язык 311
родной язык 226
настоящий друг 215
лаконичный ответ 204
милый друг 197
лучший друг 193
громкий голос 183
трава зеленая 171
упрямый осел 169
снег белый 158
передать привет 158
история учебник 13
история страны 13
история партии 13
истинный патриот 13
истинный друг 13
иностранный агент 13


Output:
иностранный язык 311 0.959877
родной язык 226 1
настоящий друг 215 1
лаконичный ответ 204 1
милый друг 197 1
лучший друг 193 1
громкий голос 183 1
трава зеленая 171 1
упрямый осел 169 1
снег белый 158 1
передать привет 158 1
история учебник 13 0.333333
история страны 13 0.333333
история партии 13 0.333333
истинный патриот 13 0.5
истинный друг 13 0.5
иностранный агент 13 0.0401235

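awk (the file is read twice: the first pass sums column 3 per first word, the second pass prints each line followed by its share of that sum):
awk -F' ' 'NR==FNR{a[$1]+=$3;next}{print($0,$3/a[$1])}' test test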
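
The same two-pass idea can be sketched in Python, for anyone who finds the one-liner terse. This is only an illustration: the function name relative_frequencies is made up here, and it assumes every non-empty line has exactly three whitespace-separated fields (word, word, count).

```python
from collections import defaultdict


def relative_frequencies(lines):
    """Return (line, share) pairs; share = count / total count for the line's first word."""
    # Assumption: every non-empty line is "word1 word2 count".
    rows = [line.split() for line in lines if line.strip()]

    # First pass: total count per first word (awk: a[$1] += $3).
    totals = defaultdict(int)
    for first, _second, count in rows:
        totals[first] += int(count)

    # Second pass: divide each count by its group's total (awk: $3 / a[$1]).
    return [(" ".join(row), int(row[2]) / totals[row[0]]) for row in rows]


if __name__ == "__main__":
    sample = ["иностранный язык 311", "родной язык 226", "иностранный агент 13"]
    for line, share in relative_frequencies(sample):
        print(line, round(share, 6))
```

Unlike the awk version, this keeps all rows in memory instead of reading the file twice, which is fine for small inputs.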

Friday, February 3, 2012

Java: setting file encoding

Input: UTF-8 file (Cyrillic)

import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

File file = new File("file_utf8");
StringBuilder buffer = new StringBuilder();
// try-with-resources (Java 7+) closes the reader and the underlying streams, even on error
try (BufferedReader in = new BufferedReader(
        new InputStreamReader(new FileInputStream(file), "UTF-8"))) {
    int ch;
    // read character by character; the reader decodes the bytes as UTF-8
    while ((ch = in.read()) > -1) {
        buffer.append((char) ch);
    }
    String[] lines = buffer.toString().split("\n");
    for (String line : lines)
        System.out.println(line);
} catch (IOException e) {
    // also covers FileNotFoundException, which is a subclass of IOException
    e.printStackTrace();
}

OrientDB: console

1. Run the console from the OrientDB bin directory: ./console.sh
2. Create database:
create database database-url user password storage-type
Example:
create database local:/usr/local/orient/databases/demo/demo admin admin local
3. Connect to database:
connect database-url user-name user-password
connect local:../databases/demo/demo admin admin 
4. Select:
select conditions
select from person 
5. Insert:
insert into class|cluster:cluster (field-name*) values (field-value*)
insert into Profile (name, surname) values ('Jay', 'Miner' ) 
6. Delete:
delete from class|cluster:cluster [where conditions]
delete from Profile where nick is null
delete from OGraphVertex where id = 0
7. List entities (classes):
classes


Thursday, February 2, 2012

Django: multi language

1. FlatPages

This approach no longer works (Django > 1.2)!

svn checkout http://django-multilingual.googlecode.com/svn/trunk/ django-multilingual-read-only
cd django-multilingual-read-only
sudo python setup.py install
Follow the instructions:
http://devdoodles.wordpress.com/2009/02/26/static-content-with-django-multilingual-flatpages/
Check that SITE_ID is correct!
2. Add dictionaries
1. Add trans or blocktrans to the HTML you want to be translated:
e.g., {% trans "Home" %}
2. Also add {% load i18n %} at the beginning of the HTML template.


3. Generate dictionaries (.po):
3.1 Set PYTHONPATH (export PYTHONPATH=/path/to/your/project/)
3.2 Run: django-admin.py makemessages -l ru
This generates .po files in locale/ru/LC_MESSAGES.
Open the file and add the translations. An entry looks like:
#: templates/flatpages/default.html:15
msgid "About"
msgstr "О нас"

3.3 Compile the files:
django-admin.py compilemessages

If you update the source files, regenerate the dictionaries for all languages with: django-admin.py makemessages -a
4. Add a form to switch languages:
html:

        <form action="/i18n/setlang/" method="post">
            {% csrf_token %}
            <input name="next" type="hidden" value="/" />
            <select name="language">
                {% for lang in LANGUAGES %}
                    <option value="{{ lang.0 }}">{{ lang.1 }}</option>
                {% endfor %}
            </select>

            <input type="submit" value="Go" />
        </form>

Urls.py: (r'^i18n/', include('django.conf.urls.i18n')),

Settings.py:
LANGUAGES = (
        ('ru','Russian'),
        ('en','English'),
)
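
For the language switch to take effect, the locale middleware also has to be enabled. The fragment below is a sketch for Django 1.x-style settings and is not copied from the original project: the surrounding middleware entries are only typical defaults, and the one firm requirement is that LocaleMiddleware comes after SessionMiddleware.

```python
# settings.py (fragment; assumes the Django 1.x MIDDLEWARE_CLASSES setting)
USE_I18N = True  # enable Django's translation machinery

MIDDLEWARE_CLASSES = (
    'django.contrib.sessions.middleware.SessionMiddleware',
    # LocaleMiddleware activates the language stored by the set_language view;
    # it uses session data, so it must come after SessionMiddleware.
    'django.middleware.locale.LocaleMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
)
```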



Wednesday, February 1, 2012

How to show all special characters in vi

vi file

:set list - turns on
:set nolist - turns off

awk: filtering and sorting by column

file:
aaa \t bbb \t 3
bbb \t ssd \t 4

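Keep only the rows whose 3rd column ($3) is greater than 1 and sort them by the 3rd column, numerically, in descending order (the $'\t' tab literal is bash syntax):

awk -F"\t" '$3>1 {print}' test | sort -n -r -t$'\t' -k3,3 > test_filter_gt1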

Count the distinct words in the 2nd column (ignoring case):
awk -F"\t" '{print $2}' test_filter_gt1 | sort -f | uniq -i | wc -l
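
The two pipelines above can also be sketched in Python. This is an illustrative equivalent, not a drop-in replacement: the function names are made up, and it assumes every non-empty line has at least three tab-separated fields with a numeric third field.

```python
def filter_and_sort(lines, threshold=1):
    """Keep tab-separated rows whose 3rd field is > threshold, sorted by it, descending."""
    rows = [line.rstrip("\n").split("\t") for line in lines if line.strip()]
    kept = [row for row in rows if float(row[2]) > threshold]
    # sorted() is stable, so rows with equal values keep their input order.
    return sorted(kept, key=lambda row: float(row[2]), reverse=True)


def count_distinct_second_column(rows):
    """Count distinct 2nd-column values, ignoring case (like sort -f | uniq -i | wc -l)."""
    return len({row[1].lower() for row in rows})


if __name__ == "__main__":
    with open("test", encoding="utf-8") as handle:
        filtered = filter_and_sort(handle)
    for row in filtered:
        print("\t".join(row))
    print(count_distinct_second_column(filtered))
```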