Approaches Used In Machine Translation


02 Nov 2017


Abstract

Really Simple Syndication (RSS) is an internet-based technology that allows website publishers to broadcast content updates to subscribers easily and frequently. Business analysts looking for content relevant to their needs have thousands of RSS feeds to choose from, and each feed may syndicate dozens of articles a day. Here we propose a translator that converts an English RSS feed into a Hindi RSS feed. The proposed translator has four modules: (I) Lexical Parser, (II) Semantic Mapper, (III) ITranslator and (IV) Composer. We concentrate on the first two modules: the lexical parser, which parses an English RSS feed, and the semantic mapper, which maps each English word to its semantically equivalent Hindi word.

Keywords : Machine Translator, RSS

1. Introduction

Technology is reaching new heights, from the conception of ideas through to practical implementation. Equal emphasis should be placed on removing the language divide, which causes a communication gap between different sections of society. Natural Language Processing (NLP) is the field that strives to fill this gap. Machine Translation (MT) deals with the transformation of text from one language into another. In India, MT has enormous scope because of the country's many regional languages: the majority of the population is fluent in a regional language such as Hindi or Marathi. In this setting, MT can be used to provide a regional-language interface.

1.1 Background

Machine Translation is an application of computer science and linguistics that supports the development of systems answering practical needs. Computer programs may not produce perfect translations of literary texts, but they do produce useful translations of manuals, documents, prospectuses, memoranda and reports.

Really Simple Syndication (RSS) is a family of standardized web feed formats that allow publishers to syndicate frequently updated documents easily. Subscribers to RSS feeds can receive updated news headlines, blog entries or other information without having to visit the publishing website. Documents provided by RSS feeds are structured so that the main content, author and other document attributes are easily located and extracted. Most users need updates from many websites whose content changes unpredictably, for example medical websites, weblogs, news sites, community and religious organization information pages, and product information pages.
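The structure described above can be illustrated with a minimal Python sketch that pulls the title, description, link and author out of each item in a feed. The feed text and the function name parse_items are invented for illustration; only the element names (item, title, description, link, author) come from the RSS 2.0 format.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed text, invented for illustration only.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First headline</title>
      <description>Short summary of the first article.</description>
      <link>http://example.com/first</link>
      <author>editor@example.com</author>
    </item>
    <item>
      <title>Second headline</title>
      <description>Short summary of the second article.</description>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def parse_items(feed_xml):
    """Return one dict per <item>, holding the fields a feed reader displays."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        items.append({
            # findtext returns the default when the element is missing.
            "title": item.findtext("title", default=""),
            "description": item.findtext("description", default=""),
            "link": item.findtext("link", default=""),
            "author": item.findtext("author", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_items(SAMPLE_FEED):
        print(entry["title"], "->", entry["link"])
```

Because every item exposes the same named fields, a translator only has to touch the text fields (title and description) and can leave the link untouched.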

1.2 Purpose

RSS solves this problem for users who regularly use the web. It allows users to stay informed by retrieving the latest content from the sites they are interested in, which saves time because each site no longer has to be visited individually. RSS relies on software called a Feed Reader or News Aggregator, which fetches the RSS feeds from various sites and displays them for reading and use. These news items are in English. This project will convert the news delivered by RSS feeds into the user's regional language (English to Hindi). It is an attempt at constructing a machine translator integrated with the RSS aggregator in a web browser.

2. Related Work

2.1 Approaches used in Machine Translation

Broadly, the approaches used for translation are rule based and corpus based. Rule-Based Machine Translation (RBMT) is also called the rational approach. The rule-based approach is further classified into direct, interlingual and transfer-based approaches. The first generation of MT systems used direct translation. The second generation used the indirect approaches of interlingual and transfer-based systems. Interlingual and transfer approaches are essentially based on the specification of rules (for morphology, syntax, lexical selection, semantic analysis and generation).

Strength of RBMT: information can be processed through introspection.

Weakness of RBMT: the accuracy of the entire process is the product of the accuracies of each sub-stage, so errors compound from stage to stage.
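The weakness stated above can be made concrete with a small arithmetic sketch in Python. The stage names and accuracy figures are hypothetical, chosen only to show the effect, and the calculation assumes that errors in each stage are independent of one another.

```python
import math


def pipeline_accuracy(stage_accuracies):
    """Overall accuracy of a staged pipeline: the product of the stage
    accuracies (assumes stage errors are independent)."""
    return math.prod(stage_accuracies)


if __name__ == "__main__":
    # Hypothetical per-stage accuracies, not measured values.
    stages = {"analysis": 0.95, "transfer": 0.90, "generation": 0.95}
    overall = pipeline_accuracy(stages.values())
    print(f"overall accuracy: {overall:.3f}")  # prints "overall accuracy: 0.812"
```

Even though no single stage falls below 90% in this illustration, the pipeline as a whole ends up at roughly 81%.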

India has several machine translation systems, including:

Name of software    Machine translation system
AnglaUrdu           Urdu to English
HindiAngla          Hindi to English
AnglaHindi          English to Hindi
AnglaBharati        English to Indian languages

2.2 RSS Feed

The basic idea of restructuring information about websites dates back to as early as 1995, when Ramanathan V. Guha and others in Apple Computer's Advanced Technology Group developed the Meta Content Framework. RSS works by having the website author maintain a list of notifications for the website in a standard way. This list of notifications is called an "RSS feed". Users who are interested in the latest headlines or changes can check this list. Special computer programs called "RSS aggregators" have been developed that automatically access the RSS feeds of websites on the user's behalf and organize the results.

Two web servers each with an RSS file being checked by an aggregator

3. Motivation

Currently, the rule-based machine translation system contains 22 rules. An example of machine translation from English to Hindi is depicted in the figure below.

3.1 Proposed Machine Translation System

The proposed machine translation system consists of the following steps:

1. A process of analyzing input sentences (morphological, syntactic and/or semantic analysis)

2. A process of translating source language text (English) into the corresponding target language text (Hindi), word-for-word and phrase-to-phrase.

3. A process of re-organizing target language words according to the target language sentence format.

Block diagram of English-Hindi rule-based machine translation.

3.2 Stanford Parser

The Stanford Parser is used for four main purposes in the machine translation system (MTS):

1. Syntactic analysis

2. Part-of-speech (POS) tagging

3. Stemming the words

4. Morphological analysis

The Stanford typed dependencies representation was designed to provide a simple description of the grammatical relationships in a sentence that can easily be understood and effectively used by people without linguistic expertise who want to extract textual relations.
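To show how such grammatical relations can be consumed, here is a minimal Python sketch that stores dependencies as (relation, governor, dependent) triples and reads the subject, verb and object from them. The triples are written by hand for the sentence "Raj reads book" and are not actual parser output; nsubj and dobj are relation labels used by the Stanford dependencies scheme, while the Dependency type and the extract_svo function are names introduced here for illustration.

```python
from typing import NamedTuple


class Dependency(NamedTuple):
    relation: str   # grammatical relation label, e.g. "nsubj"
    governor: str   # head word of the relation
    dependent: str  # word that depends on the head


# Hand-written for "Raj reads book"; an assumption, not parser output.
DEPS = [
    Dependency("nsubj", "reads", "Raj"),
    Dependency("dobj", "reads", "book"),
]


def extract_svo(deps):
    """Return (subject, verb, object) found in a list of dependencies."""
    subject = verb = obj = None
    for dep in deps:
        if dep.relation == "nsubj":    # nominal subject of the verb
            subject, verb = dep.dependent, dep.governor
        elif dep.relation == "dobj":   # direct object of the verb
            obj = dep.dependent
            verb = verb or dep.governor
    return subject, verb, obj


if __name__ == "__main__":
    print(extract_svo(DEPS))  # prints ('Raj', 'reads', 'book')
```

Reading the roles from relation labels rather than from word positions is what makes this representation usable without detailed knowledge of the parse tree.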

Comparison of English and Hindi Word Order

Language    Sentence              Word order
ES          Raj reads Book        Subject Verb Object
HS          Raj Pustak Pathata    Subject Object Verb
HS          Pustak Raj Pathata    Object Subject Verb
HS          Pathata Pustak Raj    Verb Object Subject

Thus a Hindi sentence can be written in Subject Object Verb (SOV), Object Subject Verb (OSV) or Verb Object Subject (VOS) order.

ALGORITHM

Algorithms are developed to translate English sentences to Hindi sentences based on rules and dictionary.

The basic algorithm is:

1. Split the sentence into subject, verb and object:

1.1 Identify the verb.

1.2 After the verb is identified, split the whole sentence into SVO.

2. Determine the Hindi meaning of each English word from the dictionary.

3. Rearrange the words: Hindi has a subject-object-verb (SOV) grammatical structure, unlike English, which has a subject-verb-object (SVO) sentence structure. The basic technique is to rearrange the English phrase tree into the Hindi phrase tree.
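The three steps above can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the system's implementation: the verb list and dictionary hold only the words of the example sentence "Raj reads book", the verb is found by simple lookup rather than by parsing, and the reordering works on flat word lists instead of phrase trees.

```python
# Toy resources holding only the example's words (assumptions for illustration).
VERBS = {"reads"}
DICTIONARY = {"raj": "Raj", "reads": "Pathata", "book": "Pustak"}


def split_svo(sentence):
    """Step 1: find the verb, then split the sentence into S, V and O."""
    words = sentence.strip().rstrip(".").split()
    for i, word in enumerate(words):
        if word.lower() in VERBS:
            return words[:i], [word], words[i + 1:]
    raise ValueError("no known verb found in: " + sentence)


def lookup(words):
    """Step 2: dictionary lookup; unknown words pass through unchanged."""
    return [DICTIONARY.get(word.lower(), word) for word in words]


def translate(sentence):
    """Step 3: emit the translated parts in subject-object-verb order."""
    subject, verb, obj = split_svo(sentence)
    return " ".join(lookup(subject) + lookup(obj) + lookup(verb))


if __name__ == "__main__":
    print(translate("Raj reads book"))  # prints "Raj Pustak Pathata"
```

A real system would replace the verb lookup with part-of-speech tagging and would move whole phrases, so that modifiers stay attached to the words they describe.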

WORK DONE BASED ON THIS

An RSS aggregator fetches RSS content in a web browser. RSS aggregators automatically check a series of RSS feeds for new items on an ongoing basis, which makes it possible to keep track of changes to multiple websites. They detect whether an item is a new addition and, if so, present it in a compact and useful manner. An RSS aggregator also keeps track of the items the user is interested in; when one is found, it adds a link so that the related web page can be brought up quickly for reading.

The proposed machine translator is integrated with the RSS aggregator program, as shown in the figure below.

The RSS headlines on a local news website could contain the following information, which is loaded in English into the RSS aggregator program. This English news text acts as input to the proposed RSS machine translator, which converts the English news into Hindi news.

Example

Item 1:

  Title: 'Ryder's case put extra responsibility'

  Description: Sunday, March 31, 2013 10:32 PM Delhi Daredevils' Unmukt Chand said the absence of Jesse Ryder, who is recovering in a hospital, would put additional responsibility on the team in IPL.

  Link: http://da.feedsportal.com/fy/8at2Etc0ltqdA4NF/ia1.htm
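The integration point between the aggregator and the translator can be sketched as follows. Python is used here only for illustration, the function name translate_item is hypothetical, and the translator passed in is a stand-in that merely tags the text, since the actual English-to-Hindi translator is the subject of this work. The sketch shows one design choice: only the title and description are translated, while the link is copied unchanged.

```python
def translate_item(item, translate):
    """Return a copy of an RSS item with its text fields translated.

    `translate` is any callable that maps English text to Hindi text.
    """
    translated = dict(item)  # copy, so the English item is preserved
    for field in ("title", "description"):
        translated[field] = translate(item[field])
    return translated


if __name__ == "__main__":
    item = {
        "title": "Ryder's case put extra responsibility",
        "description": "Sunday, March 31, 2013 10:32 PM Delhi Daredevils' "
                       "Unmukt Chand said the absence of Jesse Ryder, who is "
                       "recovering in a hospital, would put additional "
                       "responsibility on the team in IPL.",
        "link": "http://da.feedsportal.com/fy/8at2Etc0ltqdA4NF/ia1.htm",
    }
    # Stand-in translator (an assumption): it only marks the text as Hindi.
    hindi_item = translate_item(item, lambda text: "[hi] " + text)
    print(hindi_item["title"])
    print(hindi_item["link"])
```

Keeping the translator as a parameter lets the aggregator side be tested before the translation modules are complete.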

This RSS machine translator will be developed in Java and will act as an RSS MT plugin in the web browser (Internet Explorer, Firefox, Chrome, etc.).


