
{"id":17211,"date":"2024-09-09T12:30:22","date_gmt":"2024-09-09T10:30:22","guid":{"rendered":"https:\/\/o-seznam.cz\/kariera\/458967-ai-vyzkumnik-generativnich-jazykovych-modelu\/"},"modified":"2026-04-23T22:10:41","modified_gmt":"2026-04-23T20:10:41","slug":"458967-ai-vyzkumnik-generativnich-jazykovych-modelu","status":"publish","type":"post","link":"https:\/\/o-seznam.cz\/kariera\/458967-ai-vyzkumnik-generativnich-jazykovych-modelu\/","title":{"rendered":"AI Researcher for Generative Language Models"},"content":{"rendered":"<hr>\n<p>Our goal is to build and develop large language models for our internal LLM platform. On our own compute cluster with state-of-the-art Nvidia H100 AI accelerators, we train models with tens of billions of parameters. Our responsibility starts with getting distributed training up and running and ends with handing the model over to production. We prepare experiments, data, and measurements, and we cannot do without continuously studying SOTA approaches.<\/p>\n<p>The first generation of our own models is deployed in production, and we keep iterating on the models, improving both their general quality and specific capabilities such as context length, function calling, structured output, multimodality, or fine-tuning\/preference optimization for a particular downstream task.<\/p>\n<p>We are part of the search research department, which lets us follow a range of machine learning projects. 
The large language model research itself is distributed across locations (Prague, Brno, Zl\u00edn), yet there are plenty of opportunities for the whole team to meet, whether for work or for team building. We work closely with the MLOps team, which maintains the LLM platform that runs our models in production.<\/p>\n<h2>Basic requirements<\/h2>\n<ul>\n<li>You have non-trivial experience with large language models: ideally <strong>training<\/strong> or <strong>evaluating<\/strong> <strong>LLMs<\/strong>, or alternatively advanced <strong>prompting<\/strong><\/li>\n<li>You have a good knowledge of machine learning, neural networks, and the Transformer architecture<\/li>\n<li>You can program in <strong>Python<\/strong>, including 
a working knowledge of algorithm design<\/li>\n<li>At least 2 years of experience in a research or similar position<\/li>\n<\/ul>\n<hr>\n<h2>The work of an LLM researcher includes<\/h2>\n<ul>\n<li>Designing experiments &#8211; PyTorch, HF Transformers<\/li>\n<li>Running experiments in a distributed (multi-node) environment &#8211; Linux, Docker, k8s, DeepSpeed\/FSDP<\/li>\n<li>Preparing and analyzing data \u2013 Python, HF Datasets, Pandas, PySpark, etc.<\/li>\n<li>Designing metrics and evaluating models<\/li>\n<li>Studying state-of-the-art literature<\/li>\n<li>Collaborating as a team on problem solving and writing code \u2013 Git, code review<\/li>\n<\/ul>\n<h2>What we offer<\/h2>\n<ul>\n<li>Interesting and varied work solving non-trivial problems<\/li>\n<li>The opportunity to contribute to the development of large language models<\/li>\n<li>Work with large datasets and the option to have data annotated<\/li>\n<li>We ship to production, with a direct impact on millions of users<\/li>\n<li>Our own cluster with Nvidia H100 cards and other high-performance hardware<\/li>\n<li>Personal development \u2013 training, reading groups, and conferences (plus public talks at universities and events, if you are interested)<\/li>\n<li>Sharing knowledge and experience across Seznam<\/li>\n<li>A great team and an informal, friendly environment \u2013 we are all on a first-name basis and have no prescribed dress code<\/li>\n<\/ul>\n<h2>How the selection process works<\/h2>\n<ul>\n<li>Send us 
your CV or a link to your professional profile, and we will review it carefully.<\/li>\n<li>If we see a match, we will call you within three days to align mutual expectations and agree on next steps.<\/li>\n<li>The next step is an online test on the Codility platform and an assignment that reflects the work of our research team. Together they serve as the first technical check on the way to the position.<\/li>\n<li>If you pass, an in-person interview with the team lead follows. We will introduce the position and how the team works in more detail. We will also ask about your previous experience and go through your approach to the assignment together. The in-person meeting usually takes about 1.5\u20132 hours.<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Our goal is to build and develop large language models for our internal LLM platform. On our own compute cluster with state-of-the-art Nvidia H100 AI accelerators, we train models with tens of billions of parameters. Our responsibility starts with getting distributed training up and running and ends with handing the model over to production. We prepare experiments, data, and measurements, and we cannot do without continuously studying SOTA approaches. 
[&hellip;]<\/p>\n","protected":false},"author":0,"featured_media":0,"comment_status":"closed","ping_status":"","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_relevanssi_hide_post":"","_relevanssi_hide_content":"","_relevanssi_pin_for_all":"","_relevanssi_pin_keywords":"","_relevanssi_unpin_keywords":"","_relevanssi_related_keywords":"","_relevanssi_related_include_ids":"","_relevanssi_related_exclude_ids":"","_relevanssi_related_no_append":"","_relevanssi_related_not_related":"","_relevanssi_related_posts":"","_relevanssi_noindex_reason":"","footnotes":""},"categories":[239],"tags":[47,58],"location":[26,19,32],"technology":[68,75,93,94,203,52,200,114],"landing_category":[151,169,268,259,143,163],"class_list":["post-17211","post","type-post","status-publish","format-standard","hentry","category-it-a-technologie","tag-ozp","tag-plny-uvazek"],"acf":[],"_links":{"self":[{"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/posts\/17211","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/comments?post=17211"}],"version-history":[{"count":7,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/posts\/17211\/revisions"}],"predecessor-version":[{"id":20215,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/posts\/17211\/revisions\/20215"}],"wp:attachment":[{"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/media?parent=17211"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/categories?post=17211"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/tags?post=17211"},{"taxonomy":"location","embeddable":true,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/location?post=17211"},{"
taxonomy":"technology","embeddable":true,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/technology?post=17211"},{"taxonomy":"landing_category","embeddable":true,"href":"https:\/\/o-seznam.cz\/kariera\/wp-json\/wp\/v2\/landing_category?post=17211"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}