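Big data refers to things that can only be done once the data reaches a considerable scale (for example, gaining new insights or creating new value). Without that scale they cannot be achieved, and these things will change existing markets, organizations, and the relationship between citizens and governments.
The core of big data is "prediction." …… to have a large amount of data as the basis for prediction. In addition, these systems must be able to improve automatically over time, picking out the best signals and patterns from newly added data.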
Reading notes | Big Data
Mendel's First Law (ID: IPRB)
Problem
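Given the numbers of homozygous dominant, heterozygous, and homozygous recessive individuals, find the probability that the offspring of a randomly chosen mating pair displays the dominant phenotype.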
Given: Three positive integers k, m, and n, representing a population containing k+m+n organisms: k individuals are homozygous dominant for a factor, m are heterozygous, and n are homozygous recessive.
Return: The probability that two randomly selected mating organisms will produce an individual possessing a dominant allele (and thus displaying the dominant phenotype). Assume that any two organisms can mate.
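One way to approach this is through the complement: the offspring shows the recessive phenotype only when both parents pass on a recessive allele, so the answer is 1 minus that probability. The sketch below is my own reading of the problem, not an official solution; the function name `dominant_probability` is hypothetical, and it assumes the two parents are drawn without replacement.

```python
def dominant_probability(k: int, m: int, n: int) -> float:
    """Probability that a random mating pair yields a dominant-phenotype child.

    k: homozygous dominant, m: heterozygous, n: homozygous recessive.
    Assumes two distinct parents are drawn uniformly without replacement.
    """
    total = k + m + n
    ordered_pairs = total * (total - 1)  # ordered (first, second) picks

    # Weighted count of ordered pairs that produce a recessive child:
    #   recessive x recessive       -> always recessive
    #   heterozygous x recessive    -> recessive half the time (both orders)
    #   heterozygous x heterozygous -> recessive a quarter of the time
    recessive = (
        n * (n - 1)
        + 2 * m * n * 0.5
        + m * (m - 1) * 0.25
    )
    return 1.0 - recessive / ordered_pairs


if __name__ == "__main__":
    print(f"{dominant_probability(2, 2, 2):.5f}")  # prints 0.78333
```

Counting ordered pairs keeps the arithmetic simple: every case is divided by the same denominator, so there is no need to track the first and second draw separately.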
Computing GC Content (ID: GC)
Problem
Given a FASTA file containing at most 10 DNA sequences, find the sequence with the highest GC content and report that GC content.
Given: At most 10 DNA strings in FASTA format (of length at most 1 kbp each).
Return: The ID of the string having the highest GC-content, followed by the GC-content of that string. Rosalind allows for a default error of 0.001 in all decimal answers unless otherwise stated; please see the note on absolute error below.
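A minimal sketch of one possible approach: parse the FASTA text into a dictionary of ID to sequence, compute the share of G and C in each, and keep the maximum. The helper names (`parse_fasta`, `gc_content`, `highest_gc`) are hypothetical, and reporting the value as a percentage is an assumption on my part. Sequences may wrap across several lines, so each line is appended to the current record.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records: dict[str, str] = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def gc_content(seq: str) -> float:
    """GC content of a DNA string as a percentage (assumed output unit)."""
    if not seq:
        return 0.0
    gc = sum(1 for base in seq.upper() if base in "GC")
    return 100.0 * gc / len(seq)


def highest_gc(text: str) -> tuple[str, float]:
    """Return (record_id, gc_percent) for the record with the highest GC content."""
    records = parse_fasta(text)
    best_id = max(records, key=lambda rid: gc_content(records[rid]))
    return best_id, gc_content(records[best_id])


if __name__ == "__main__":
    sample = ">seq_a\nATGC\n>seq_b\nGGCC\nAT\n"  # made-up toy input
    rid, pct = highest_gc(sample)
    print(rid)
    print(f"{pct:.6f}")
```

In the toy input, `seq_b` spans two lines and becomes `GGCCAT`, which has 4 of 6 bases as G or C, so it beats `seq_a` at 50%.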
Rabbits and Recurrence Relations (ID: FIB)
Problem
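Assume rabbits need one month to reach sexual maturity, that every mature pair then produces k pairs of offspring each month, and that no rabbit ever dies. Find the number of rabbit pairs after n months.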
Given: Positive integers n≤40 and k≤5.
Return: The total number of rabbit pairs that will be present after n months, if we begin with 1 pair and in each generation, every pair of reproduction-age rabbits produces a litter of k rabbit pairs (instead of only 1 pair).
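The rule above can be written as the recurrence F(n) = F(n-1) + k·F(n-2) with F(1) = F(2) = 1: last month's pairs all survive, and each pair that existed two months ago is now mature and adds k new pairs. Below is a short iterative sketch of that recurrence; the function name `rabbit_pairs` is hypothetical.

```python
def rabbit_pairs(n: int, k: int) -> int:
    """Number of rabbit pairs after n months, with k new pairs per mature pair.

    Uses F(n) = F(n-1) + k * F(n-2), with F(1) = F(2) = 1.
    """
    prev, curr = 1, 1  # F(1), F(2)
    for _ in range(n - 2):
        # Pairs alive last month survive; pairs alive two months ago breed.
        prev, curr = curr, curr + k * prev
    return curr


if __name__ == "__main__":
    print(rabbit_pairs(5, 3))  # prints 19
```

For n = 5 and k = 3 the sequence runs 1, 1, 4, 7, 19. An iterative loop avoids the exponential blow-up of naive recursion, and Python integers do not overflow, so n up to 40 is no problem.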