From a2cd141b9f3b1abf15e5fdff738fc0160355880a Mon Sep 17 00:00:00 2001 From: pbrook Date: Tue, 3 Feb 2004 14:45:44 +0000 Subject: [PATCH] Merge from csl-arm-branch. 2004-01-30 Paul Brook * aof.h (REGISTER_NAMES): Add vfp reg names (ADDITIONAL_REGISTER_NAMES): Ditto. * aout.h (REGISTER_NAMES): Ditto. (ADDITIONAL_REGISTER_NAMES): Ditto. * arm-protos.h: Update/Add Prototypes. * arm.c (init_fp_table): Rename from init_fpa_table. Update users. Only allow 0.0 for VFP. (fp_consts_inited): Rename from fpa_consts_inited. Update users. (values_fp): Rename from values_fpa. Update Users. (arm_const_double_rtx): Rename from const_double_rtx_ok_for_fpa. Update users. Only check valid constants for this hardware. (arm_float_rhs_operand): Rename from fpa_rhs_operand. Update Users. Only allow consts for FPA. (arm_float_add_operand): Rename from fpa_add_operand. Update users. Only allow consts for FPA. (use_return_insn): Check for saved VFP regs. (arm_legitimate_address_p): Handle VFP DFmode addressing. (arm_legitimize_address): Ditto. (arm_general_register_operand): New function. (vfp_mem_operand): New function. (vfp_compare_operand): New function. (vfp_secondary_reload_class): New function. (arm_float_compare_operand): New function. (vfp_print_multi): New function. (vfp_output_fstmx): New function. (vfp_emit_fstm): New function. (arm_output_epilogue): Output VPF reg restore code. (arm_expand_prologue): Output VFP reg save code. (arm_print_operand): Add 'P'. (arm_hard_regno_mode_ok): Return modes for VFP regs. (arm_regno_class): Return classes for VFP regs. (arm_compute_initial_elimination_offset): Include space for VFP regs. (arm_get_frame_size): Ditto. * arm.h (FIXED_REGISTERS): Add VFP regs. (CALL_USED_REGISTERS): Ditto. (CONDITIONAL_REGISTER_USAGE): Enable VFP regs. (FIRST_VFP_REGNUM): Define. (LAST_VFP_REGNUM): Define. (IS_VFP_REGNUM): Define. (FIRST_PSEUDO_REGISTER): Include VFP regs. (HARD_REGNO_NREGS): Handle VFP regs. (REG_ALLOC_ORDER): Add VFP regs. 
(enum reg_class): Add VFP_REGS. (REG_CLASS_NAMES): Ditto. (REG_CLASS_CONTENTS): Ditto. (CANNOT_CHANGE_MODE_CLASS) Handle VFP Regs. (REG_CLASS_FROM_LETTER): Add 'w'. (EXTRA_CONSTRAINT_ARM): Add 'U'. (EXTRA_MEMORY_CONSTRAINT): Define. (SECONDARY_OUTPUT_RELOAD_CLASS): Handle VFP regs. (SECONDARY_INPUT_RELOAD_CLASS): Ditto. (REGISTER_MOVE_COST): Ditto. (PREDICATE_CODES): Add arm_general_register_operand, arm_float_compare_operand and vfp_compare_operand. * arm.md (various): Rename as above. (divsf3): Enable when TARGET_VFP. (divdf3): Ditto. (movdfcc): Ditto. (sqrtsf2): Ditto. (sqrtdf2): Ditto. (arm_movdi): Disable when TARGET_VFP. (arm_movsi_insn): Ditto. (movsi): Only split with general regs. (cmpsf): Use arm_float_compare_operand. (push_fp_multi): Restrict to TARGET_FPA. (vfp.md): Include. * vfp.md: New file. * fpa.md (various): Rename as above. * doc/md.texi: Document ARM w and U constraints. 2004-01-15 Paul Brook * config.gcc: Add with_fpu. Allow with-float=softfp. * config/arm/arm.c (arm_override_options): Rename *-s to *s. Break out of loop when we find a float-abi. Fix typo. * config/arm/arm.h (OPTION_DEFAULT_SPECS): Add "fpu". Set -mfloat-abi=. * doc/install.texi: Document --with-fpu. 2003-01-14 Paul Brook * config.gcc (with_arch): Add armv6. * config/arm/arm.h: Rename TARGET_CPU_*_s to TARGET_CPU_*s. * config/arm/arm.c (arm_overrride_options): Ditto. 2004-01-08 Richard Earnshaw * arm.c (FL_ARCH3M): Renamed from FL_FAST_MULT. (FL_ARCH6): Renamed from FL_ARCH6J. (arm_arch3m): Renamed from arm_fast_multiply. (arm_arch6): Renamed from arm_arch6j. * arm.h: Update all uses of above. * arm-cores.def: Likewise. * arm.md: Likewise. * arm.h (CPP_CPU_ARCH_SPEC): Emit __ARM_ARCH_6J__ define for armV6j, not arm6j. Add entry for arch armv6. 2004-01-07 Richard Earnshaw * arm.c (arm_emit_extendsi): Delete. * arm-protos.h (arm_emit_extendsi): Delete. * arm.md (zero_extendhisi2): Also handle zero-extension of non-subregs. (zero_extendqisi2, extendhisi2, extendqisi2): Likewise. 
(thumb_zero_extendhisi2): Only match if not v6. (arm_zero_extendhisi2, thumb_zero_extendqisi2, arm_zero_extendqisi2) (thumb_extendhisi2, arm_extendhisi2, arm_extendqisi) (thumb_extendqisi2): Likewise. (thumb_zero_extendhisi2_v6, arm_zero_extendhisi2_v6): New patterns. (thumb_zero_extendqisi2_v6, arm_zero_extendqisi2_v6): New patterns. (thumb_extendhisi2_insn_v6, arm_extendhisi2_v6): New patterns. (thumb_extendqisi2_v6, arm_extendqisi_v6): New patterns. (arm_zero_extendhisi2_reg, arm_zero_extendqisi2_reg): Delete. (arm_extendhisi2_reg, arm_extendqisi2_reg): Delete. (arm_zero_extendhisi2addsi): Remove subreg. Add attributes. (arm_zero_extendqisi2addsi, arm_extendhisi2addsi): Likewise. (arm_extendqisi2addsi): Likewise. 2003-12-31 Mark Mitchell Revert this change: * config/arm/arm.h (THUMB_LEGTITIMIZE_RELOAD_ADDRESS): Reload REG + REG addressing modes. * config/arm/arm.h (THUMB_LEGTITIMIZE_RELOAD_ADDRESS): Reload REG + REG addressing modes. 2003-12-30 Mark Mitchell * config/arm/arm.h (THUMB_LEGITIMATE_CONSTANT_P): Accept CONSTANT_P_RTX. 2003-30-12 Paul Brook * longlong.h: protect arm inlines with !defined (__thumb__) 2003-30-12 Paul Brook * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Always define __arm__. 2003-12-30 Nathan Sidwell * builtins.c (expand_builtin_apply_args_1): Fix typo in previous change. 2003-12-29 Nathan Sidwell * builtins.c (expand_builtin_apply_args_1): Add pretend args size to the virtual incoming args pointer for downward stacks. 2003-12-29 Paul Brook * config/arm/arm-cores.def: Add cost function. * config/arm/arm.c (arm_*_rtx_costs): New functions. (arm_rtx_costs): Remove (struct processors): Add rtx_costs field. (all_cores, all_architectures): Ditto. (arm_override_options): Set targetm.rtx_costs. (thumb_rtx_costs): New function. (arm_rtx_costs_1): Remove cases handled elsewhere. * config/arm/arm.h (processor_type): Add COSTS parameter. 2003-12-29 Nathan Sidwell * config/arm/arm.md (generic_sched): arm926 has its own scheduler. 
(arm926ejs.md): Include it. * config/arm/arm926ejs.md: New pipeline description. 2003-12-24 Paul Brook * config/arm/arm.c (arm_arch6j): New variable. (arm_override_options): Set it. (arm_emit_extendsi): New function. * config/arm/arm-protos.h (arm_emit_extendsi): Add prototype. * config/arm/arm.h (arm_arch6j): Declare. * config/arm/arm.md: Add sign/zero extend insns. 2003-12-23 Paul Brook * config/arm/arm.c (all_architectures): Add armv6. * doc/invoke.texi: Document it. 2003-12-19 Paul Brook * config/arm/arm.md: Add load1 and load_byte "type" attrs. Modify insn patterns to match. * config/arm/arm-generic.md: Ditto. * config/arm/cirrus.md: Ditto. * config/arm/fpa.md: Ditto. * config/amm/iwmmxt.md: Ditto. * config/arm/arm1026ejs.md: Ditto. * config/arm/arm1135jfs.md: Ditto. Add insn_reservation and bypasses for 11_loadb. 2003-12-18 Nathan Sidwell * config/arm/arm-protos.h (arm_no_early_alu_shift_value_dep): Declare. * config/arm/arm.c (arm_adjust_cost): Check shift cost for TYPE_ALU_SHIFT and TYPE_ALU_SHIFT_REG. (arm_no_early_store_addr_dep, arm_no_early_alu_shift_dep, arm_no_early_mul_dep): Correctly deal with conditional execution, parallels and single shift operations. (arm_no_early_alu_shift_value_dep): Define. * arm.md (attr type): Replace 'normal' with 'alu', 'alu_shift' and 'alu_shift_reg'. (attr core_cycles): Adjust. (*addsi3_carryin_shift, andsi_not_shiftsi_si, *arm_shiftsi3, *shiftsi3_compare0, *notsi_shiftsi, *notsi_shiftsi_compare0, *not_shiftsi_compare0_scratch, *cmpsi_shiftsi, *cmpsi_shiftsi_swp, *cmpsi_neg_shiftsi, *arith_shiftsi, *arith_shiftsi_compare0, *arith_shiftsi_compare0_scratch, *sub_shiftsi, *sub_shiftsi_compare0, *sub_shiftsi_compare0_scratch, *if_shift_move, *if_move_shift, *if_shift_shift): Set type attribute appropriately. * config/arm/arm1026ejs.md (alu_op): Adjust. (alu_shift_op, alu_shift_reg_op): New. * config/arm/arm1136.md: Add better bypasses for early registers. Remove load[234] and store[234] bypasses. (11_alu_op): Adjust. 
(11_alu_shift_op, 11_alu_shift_reg_op): New. 2003-12-15 Nathan Sidwell * config/arm/arm-protos.h (arm_no_early_store_addr_dep, arm_no_early_alu_shift_dep, arm_no_early_mul_dep): Declare. * config/arm/arm.c (arm_no_early_store_addr_dep, arm_no_early_alu_shift_dep, arm_no_early_mul_dep): Define. * config/arm/arm1026ejs.md: Add load-store bypass. * config/arm/arm1136jfs.md (11_alu_op): Take 2 cycles. Add bypasses between instructions. 2003-12-10 Paul Brook * config/arm/arm.c (arm_fpu_model): New variable. (arm_fload_abi): New variable. (target_fpe_name): Rename from target_fp_name. (target_fpu_name): New variable. (arm_is_cirrus): Remove. (fpu_desc): New struct. (all_fpus): Define. (pf_model_for_fpu): Define. (all_loat_abis): Define. (arm_override_options): Set fp arch flags based on -mfpu= and -float-abi=. (FIRST_FPA_REGNUM): Rename from FIRST_ARM_FP_REGNUM. (LAST_FPA_REGNUM): Rename from LAST_ARM_FP_REGNUM. (*): Use new TARGET_* flags. * config/arm/arm.h (TARGET_ANY_HARD_FLOAT): Remove. (TARGET_HARD_FLOAT): No longer implies TARGET_FPA. (TARGET_SOFT_FLOAT): Ditto. (TARGET_SOFT_FLOAT_ABI): New. (TARGET_MAVERICK): Rename from TARGET_CIRRUS. No longer implies TARGET_HARD_FLOAT. (TARGET_VFP): No longer implies TARGET_HARD_FLOAT. (TARGET_OPTIONS): Add -mfpu=. (FIRST_FPA_REGNUM): Rename from FIRST_ARM_FP_REGNUM. (LAST_FPA_REGNUM): Rename from LAST_ARM_FP_REGNUM. (arm_pf_model): Define. (arm_float_abi_type): Define. (fputype): Add FPUTYPE_VFP. Change SOFT_FPA->NONE * config/arm/arm.md: Use new TARGET_* flags. * config/arm/cirrus.md: Ditto. * config/arm/fpa.md: Ditto. * config/arm/elf.h (ASM_SPEC): Pass -mfloat-abi= and -mfpu=. * config/arm/semi.h (ASM_SPEC): Ditto. * config/arm/netbsd-elf.h (SUBTARGET_ASM_FLOAT_SPEC): Specify vfp. (FPUTYPE_DEFAULT): Set to VFP. * doc/invoke.texi: Document -mfpu= and -mfloat-abi=. 2003-11-22 Phil Edwards PR target/12476 * config/arm/arm.c (arm_output_mi_thunk): In Thumb mode, use 'bx' instead of 'b' to avoid branch range restrictions. 
Output the thunk immediately before the thunked-to function. * config/arm/arm.h (ARM_DECLARE_FUNCTION_NAME): Do not emit .thumb_func if a thunk is being generated. Emit .code 16 along with .thumb_func if a thunk is not being generated. 2003-11-15 Nicolas Pitre * config/arm/arm.md (ashldi3, arm_ashldi3_1bit, ashrdi3, arm_ashrdi3_1bit, lshrdi3, arm_lshrdi3_1bit): New patterns. * config/arm/iwmmxt.md (ashrdi3_iwmmxt): Renamed from ashrdi3. (lshrdi3_iwmmxt): Renamed from lshrdi3. * config/arm/arm.c (IWMMXT_BUILTIN2): Renamed argument accordingly. 2003-11-12 Steve Woodford Ian Lance Taylor * config/arm/lib1funcs.asm (ARM_DIV_BODY, ARM_MOD_BODY): Add new code for __ARM_ARCH__ >= 5 && ! defined (__OPTIMIZE_SIZE__). 2003-11-05 Phil Edwards * config/arm/arm.md (insn): Add new V6 instruction names. (generic_sched): New attr. * config/arm/arm-generic.md: Use generic_sched here. * config/arm/arm1026ejs.md: Do not model fetch/issue/decode stages of pipeline. Adjust latency counts accordingly. * config/arm/arm1136jfs.md: New file. 2003-10-28 Mark Mitchell * config/arm/arm.h (processor_type): New enumeration type. (CPP_ARCH_DEFAULT_SPEC): Set appropriately for ARM 926EJ-S, ARM1026EJ-S, ARM1136J-S, and ARM1136JF-S processor cores. (CPP_CPU_ARCH_SPEC): Likewise. * config/arm/arm.c (arm_tune): New variable. (all_cores): Use cores.def. (all_architectures): Add representative processor. (arm_override_options): Restructure way in which tuning information is deduced. * arm.md: Update "insn" and "type" attributes throughout. (insn): New attribute. (type): Compute "mult" from "insn" attribute. Add load2, load3, load4 alternatives. (arm automaton): Move to arm-generic.md. * config/arm/arm-cores.def: New file. * config/arm/arm-generic.md: Likewise. * config/arm/arm1026ejs.md: Likewise. 
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@77171 138bc75d-0d04-0410-961f-82ee72b054a4 --- gcc/ChangeLog | 329 +++++++++ gcc/config.gcc | 19 +- gcc/config/arm/aof.h | 23 +- gcc/config/arm/aout.h | 27 +- gcc/config/arm/arm-cores.def | 87 +++ gcc/config/arm/arm-generic.md | 152 ++++ gcc/config/arm/arm-protos.h | 19 +- gcc/config/arm/arm.c | 1625 +++++++++++++++++++++++++++++++---------- gcc/config/arm/arm.h | 370 +++++++--- gcc/config/arm/arm.md | 1279 ++++++++++++++++++++------------ gcc/config/arm/arm1026ejs.md | 241 ++++++ gcc/config/arm/arm1136jfs.md | 377 ++++++++++ gcc/config/arm/arm926ejs.md | 188 +++++ gcc/config/arm/cirrus.md | 88 +-- gcc/config/arm/elf.h | 4 +- gcc/config/arm/fpa.md | 172 ++--- gcc/config/arm/iwmmxt.md | 18 +- gcc/config/arm/lib1funcs.asm | 48 +- gcc/config/arm/linux-elf.h | 2 +- gcc/config/arm/netbsd-elf.h | 13 +- gcc/config/arm/semi.h | 3 +- gcc/config/arm/vfp.md | 744 +++++++++++++++++++ gcc/doc/install.texi | 9 +- gcc/doc/invoke.texi | 31 +- gcc/doc/md.texi | 6 + gcc/longlong.h | 2 +- 26 files changed, 4737 insertions(+), 1139 deletions(-) create mode 100644 gcc/config/arm/arm-cores.def create mode 100644 gcc/config/arm/arm-generic.md create mode 100644 gcc/config/arm/arm1026ejs.md create mode 100644 gcc/config/arm/arm1136jfs.md create mode 100644 gcc/config/arm/arm926ejs.md create mode 100644 gcc/config/arm/vfp.md diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 934168003ed..fedcaadeaa0 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,332 @@ +2004-02-02 Paul Brook + + Merge from csl-arm-branch. + + 2004-01-30 Paul Brook + + * aof.h (REGISTER_NAMES): Add vfp reg names + (ADDITIONAL_REGISTER_NAMES): Ditto. + * aout.h (REGISTER_NAMES): Ditto. + (ADDITIONAL_REGISTER_NAMES): Ditto. + * arm-protos.h: Update/Add Prototypes. + * arm.c (init_fp_table): Rename from init_fpa_table. Update users. + Only allow 0.0 for VFP. + (fp_consts_inited): Rename from fpa_consts_inited. Update users. + (values_fp): Rename from values_fpa. 
Update Users. + (arm_const_double_rtx): Rename from const_double_rtx_ok_for_fpa. + Update users. Only check valid constants for this hardware. + (arm_float_rhs_operand): Rename from fpa_rhs_operand. Update Users. + Only allow consts for FPA. + (arm_float_add_operand): Rename from fpa_add_operand. Update users. + Only allow consts for FPA. + (use_return_insn): Check for saved VFP regs. + (arm_legitimate_address_p): Handle VFP DFmode addressing. + (arm_legitimize_address): Ditto. + (arm_general_register_operand): New function. + (vfp_mem_operand): New function. + (vfp_compare_operand): New function. + (vfp_secondary_reload_class): New function. + (arm_float_compare_operand): New function. + (vfp_print_multi): New function. + (vfp_output_fstmx): New function. + (vfp_emit_fstm): New function. + (arm_output_epilogue): Output VPF reg restore code. + (arm_expand_prologue): Output VFP reg save code. + (arm_print_operand): Add 'P'. + (arm_hard_regno_mode_ok): Return modes for VFP regs. + (arm_regno_class): Return classes for VFP regs. + (arm_compute_initial_elimination_offset): Include space for VFP regs. + (arm_get_frame_size): Ditto. + * arm.h (FIXED_REGISTERS): Add VFP regs. + (CALL_USED_REGISTERS): Ditto. + (CONDITIONAL_REGISTER_USAGE): Enable VFP regs. + (FIRST_VFP_REGNUM): Define. + (LAST_VFP_REGNUM): Define. + (IS_VFP_REGNUM): Define. + (FIRST_PSEUDO_REGISTER): Include VFP regs. + (HARD_REGNO_NREGS): Handle VFP regs. + (REG_ALLOC_ORDER): Add VFP regs. + (enum reg_class): Add VFP_REGS. + (REG_CLASS_NAMES): Ditto. + (REG_CLASS_CONTENTS): Ditto. + (CANNOT_CHANGE_MODE_CLASS) Handle VFP Regs. + (REG_CLASS_FROM_LETTER): Add 'w'. + (EXTRA_CONSTRAINT_ARM): Add 'U'. + (EXTRA_MEMORY_CONSTRAINT): Define. + (SECONDARY_OUTPUT_RELOAD_CLASS): Handle VFP regs. + (SECONDARY_INPUT_RELOAD_CLASS): Ditto. + (REGISTER_MOVE_COST): Ditto. + (PREDICATE_CODES): Add arm_general_register_operand, + arm_float_compare_operand and vfp_compare_operand. + * arm.md (various): Rename as above. 
+ (divsf3): Enable when TARGET_VFP. + (divdf3): Ditto. + (movdfcc): Ditto. + (sqrtsf2): Ditto. + (sqrtdf2): Ditto. + (arm_movdi): Disable when TARGET_VFP. + (arm_movsi_insn): Ditto. + (movsi): Only split with general regs. + (cmpsf): Use arm_float_compare_operand. + (push_fp_multi): Restrict to TARGET_FPA. + (vfp.md): Include. + * vfp.md: New file. + * fpa.md (various): Rename as above. + * doc/md.texi: Document ARM w and U constraints. + + 2004-01-15 Paul Brook + + * config.gcc: Add with_fpu. Allow with-float=softfp. + * config/arm/arm.c (arm_override_options): Rename *-s to *s. + Break out of loop when we find a float-abi. Fix typo. + * config/arm/arm.h (OPTION_DEFAULT_SPECS): Add "fpu". + Set -mfloat-abi=. + * doc/install.texi: Document --with-fpu. + + 2003-01-14 Paul Brook + + * config.gcc (with_arch): Add armv6. + * config/arm/arm.h: Rename TARGET_CPU_*_s to TARGET_CPU_*s. + * config/arm/arm.c (arm_overrride_options): Ditto. + + 2004-01-08 Richard Earnshaw + + * arm.c (FL_ARCH3M): Renamed from FL_FAST_MULT. + (FL_ARCH6): Renamed from FL_ARCH6J. + (arm_arch3m): Renamed from arm_fast_multiply. + (arm_arch6): Renamed from arm_arch6j. + * arm.h: Update all uses of above. + * arm-cores.def: Likewise. + * arm.md: Likewise. + + * arm.h (CPP_CPU_ARCH_SPEC): Emit __ARM_ARCH_6J__ define for armV6j, + not arm6j. Add entry for arch armv6. + + 2004-01-07 Richard Earnshaw + + * arm.c (arm_emit_extendsi): Delete. + * arm-protos.h (arm_emit_extendsi): Delete. + * arm.md (zero_extendhisi2): Also handle zero-extension of + non-subregs. + (zero_extendqisi2, extendhisi2, extendqisi2): Likewise. + (thumb_zero_extendhisi2): Only match if not v6. + (arm_zero_extendhisi2, thumb_zero_extendqisi2, arm_zero_extendqisi2) + (thumb_extendhisi2, arm_extendhisi2, arm_extendqisi) + (thumb_extendqisi2): Likewise. + (thumb_zero_extendhisi2_v6, arm_zero_extendhisi2_v6): New patterns. + (thumb_zero_extendqisi2_v6, arm_zero_extendqisi2_v6): New patterns. 
+ (thumb_extendhisi2_insn_v6, arm_extendhisi2_v6): New patterns. + (thumb_extendqisi2_v6, arm_extendqisi_v6): New patterns. + (arm_zero_extendhisi2_reg, arm_zero_extendqisi2_reg): Delete. + (arm_extendhisi2_reg, arm_extendqisi2_reg): Delete. + (arm_zero_extendhisi2addsi): Remove subreg. Add attributes. + (arm_zero_extendqisi2addsi, arm_extendhisi2addsi): Likewise. + (arm_extendqisi2addsi): Likewise. + + 2003-12-31 Mark Mitchell + + Revert this change: + * config/arm/arm.h (THUMB_LEGTITIMIZE_RELOAD_ADDRESS): Reload REG + + REG addressing modes. + + * config/arm/arm.h (THUMB_LEGTITIMIZE_RELOAD_ADDRESS): Reload REG + + REG addressing modes. + + 2003-12-30 Mark Mitchell + + * config/arm/arm.h (THUMB_LEGITIMATE_CONSTANT_P): Accept + CONSTANT_P_RTX. + + 2003-30-12 Paul Brook + + * longlong.h: protect arm inlines with !defined (__thumb__) + + 2003-30-12 Paul Brook + + * config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Always define __arm__. + + 2003-12-30 Nathan Sidwell + + * builtins.c (expand_builtin_apply_args_1): Fix typo in previous + change. + + 2003-12-29 Nathan Sidwell + + * builtins.c (expand_builtin_apply_args_1): Add pretend args size + to the virtual incoming args pointer for downward stacks. + + 2003-12-29 Paul Brook + + * config/arm/arm-cores.def: Add cost function. + * config/arm/arm.c (arm_*_rtx_costs): New functions. + (arm_rtx_costs): Remove + (struct processors): Add rtx_costs field. + (all_cores, all_architectures): Ditto. + (arm_override_options): Set targetm.rtx_costs. + (thumb_rtx_costs): New function. + (arm_rtx_costs_1): Remove cases handled elsewhere. + * config/arm/arm.h (processor_type): Add COSTS parameter. + + 2003-12-29 Nathan Sidwell + + * config/arm/arm.md (generic_sched): arm926 has its own scheduler. + (arm926ejs.md): Include it. + * config/arm/arm926ejs.md: New pipeline description. + + 2003-12-24 Paul Brook + + * config/arm/arm.c (arm_arch6j): New variable. + (arm_override_options): Set it. + (arm_emit_extendsi): New function. 
+ * config/arm/arm-protos.h (arm_emit_extendsi): Add prototype. + * config/arm/arm.h (arm_arch6j): Declare. + * config/arm/arm.md: Add sign/zero extend insns. + + 2003-12-23 Paul Brook + + * config/arm/arm.c (all_architectures): Add armv6. + * doc/invoke.texi: Document it. + + 2003-12-19 Paul Brook + + * config/arm/arm.md: Add load1 and load_byte "type" attrs. Modify + insn patterns to match. + * config/arm/arm-generic.md: Ditto. + * config/arm/cirrus.md: Ditto. + * config/arm/fpa.md: Ditto. + * config/amm/iwmmxt.md: Ditto. + * config/arm/arm1026ejs.md: Ditto. + * config/arm/arm1135jfs.md: Ditto. Add insn_reservation and bypasses + for 11_loadb. + + 2003-12-18 Nathan Sidwell + + * config/arm/arm-protos.h (arm_no_early_alu_shift_value_dep): Declare. + * config/arm/arm.c (arm_adjust_cost): Check shift cost for + TYPE_ALU_SHIFT and TYPE_ALU_SHIFT_REG. + (arm_no_early_store_addr_dep, arm_no_early_alu_shift_dep, + arm_no_early_mul_dep): Correctly deal with conditional execution, + parallels and single shift operations. + (arm_no_early_alu_shift_value_dep): Define. + * arm.md (attr type): Replace 'normal' with 'alu', + 'alu_shift' and 'alu_shift_reg'. + (attr core_cycles): Adjust. + (*addsi3_carryin_shift, andsi_not_shiftsi_si, *arm_shiftsi3, + *shiftsi3_compare0, *notsi_shiftsi, *notsi_shiftsi_compare0, + *not_shiftsi_compare0_scratch, *cmpsi_shiftsi, *cmpsi_shiftsi_swp, + *cmpsi_neg_shiftsi, *arith_shiftsi, *arith_shiftsi_compare0, + *arith_shiftsi_compare0_scratch, *sub_shiftsi, + *sub_shiftsi_compare0, *sub_shiftsi_compare0_scratch, + *if_shift_move, *if_move_shift, *if_shift_shift): Set type + attribute appropriately. + * config/arm/arm1026ejs.md (alu_op): Adjust. + (alu_shift_op, alu_shift_reg_op): New. + * config/arm/arm1136.md: Add better bypasses for early + registers. Remove load[234] and store[234] bypasses. + (11_alu_op): Adjust. + (11_alu_shift_op, 11_alu_shift_reg_op): New. 
+ + 2003-12-15 Nathan Sidwell + + * config/arm/arm-protos.h (arm_no_early_store_addr_dep, + arm_no_early_alu_shift_dep, arm_no_early_mul_dep): Declare. + * config/arm/arm.c (arm_no_early_store_addr_dep, + arm_no_early_alu_shift_dep, arm_no_early_mul_dep): Define. + * config/arm/arm1026ejs.md: Add load-store bypass. + * config/arm/arm1136jfs.md (11_alu_op): Take 2 cycles. + Add bypasses between instructions. + + 2003-12-10 Paul Brook + + * config/arm/arm.c (arm_fpu_model): New variable. + (arm_fload_abi): New variable. + (target_fpe_name): Rename from target_fp_name. + (target_fpu_name): New variable. + (arm_is_cirrus): Remove. + (fpu_desc): New struct. + (all_fpus): Define. + (pf_model_for_fpu): Define. + (all_loat_abis): Define. + (arm_override_options): Set fp arch flags based on -mfpu= + and -float-abi=. + (FIRST_FPA_REGNUM): Rename from FIRST_ARM_FP_REGNUM. + (LAST_FPA_REGNUM): Rename from LAST_ARM_FP_REGNUM. + (*): Use new TARGET_* flags. + * config/arm/arm.h (TARGET_ANY_HARD_FLOAT): Remove. + (TARGET_HARD_FLOAT): No longer implies TARGET_FPA. + (TARGET_SOFT_FLOAT): Ditto. + (TARGET_SOFT_FLOAT_ABI): New. + (TARGET_MAVERICK): Rename from TARGET_CIRRUS. No longer implies + TARGET_HARD_FLOAT. + (TARGET_VFP): No longer implies TARGET_HARD_FLOAT. + (TARGET_OPTIONS): Add -mfpu=. + (FIRST_FPA_REGNUM): Rename from FIRST_ARM_FP_REGNUM. + (LAST_FPA_REGNUM): Rename from LAST_ARM_FP_REGNUM. + (arm_pf_model): Define. + (arm_float_abi_type): Define. + (fputype): Add FPUTYPE_VFP. Change SOFT_FPA->NONE + * config/arm/arm.md: Use new TARGET_* flags. + * config/arm/cirrus.md: Ditto. + * config/arm/fpa.md: Ditto. + * config/arm/elf.h (ASM_SPEC): Pass -mfloat-abi= and -mfpu=. + * config/arm/semi.h (ASM_SPEC): Ditto. + * config/arm/netbsd-elf.h (SUBTARGET_ASM_FLOAT_SPEC): Specify vfp. + (FPUTYPE_DEFAULT): Set to VFP. + * doc/invoke.texi: Document -mfpu= and -mfloat-abi=. 
+ + 2003-11-22 Phil Edwards + + PR target/12476 + * config/arm/arm.c (arm_output_mi_thunk): In Thumb mode, use + 'bx' instead of 'b' to avoid branch range restrictions. Output + the thunk immediately before the thunked-to function. + * config/arm/arm.h (ARM_DECLARE_FUNCTION_NAME): Do not emit + .thumb_func if a thunk is being generated. Emit .code 16 along + with .thumb_func if a thunk is not being generated. + + 2003-11-15 Nicolas Pitre + + * config/arm/arm.md (ashldi3, arm_ashldi3_1bit, ashrdi3, + arm_ashrdi3_1bit, lshrdi3, arm_lshrdi3_1bit): New patterns. + * config/arm/iwmmxt.md (ashrdi3_iwmmxt): Renamed from ashrdi3. + (lshrdi3_iwmmxt): Renamed from lshrdi3. + * config/arm/arm.c (IWMMXT_BUILTIN2): Renamed argument accordingly. + + 2003-11-12 Steve Woodford + Ian Lance Taylor + + * config/arm/lib1funcs.asm (ARM_DIV_BODY, ARM_MOD_BODY): Add new + code for __ARM_ARCH__ >= 5 && ! defined (__OPTIMIZE_SIZE__). + + 2003-11-05 Phil Edwards + + * config/arm/arm.md (insn): Add new V6 instruction names. + (generic_sched): New attr. + * config/arm/arm-generic.md: Use generic_sched here. + * config/arm/arm1026ejs.md: Do not model fetch/issue/decode + stages of pipeline. Adjust latency counts accordingly. + * config/arm/arm1136jfs.md: New file. + + 2003-10-28 Mark Mitchell + + * config/arm/arm.h (processor_type): New enumeration type. + (CPP_ARCH_DEFAULT_SPEC): Set appropriately for ARM 926EJ-S, + ARM1026EJ-S, ARM1136J-S, and ARM1136JF-S processor cores. + (CPP_CPU_ARCH_SPEC): Likewise. + * config/arm/arm.c (arm_tune): New variable. + (all_cores): Use cores.def. + (all_architectures): Add representative processor. + (arm_override_options): Restructure way in which tuning + information is deduced. + * arm.md: Update "insn" and "type" attributes throughout. + (insn): New attribute. + (type): Compute "mult" from "insn" attribute. Add load2, + load3, load4 alternatives. + (arm automaton): Move to arm-generic.md. + * config/arm/arm-cores.def: New file. 
+ * config/arm/arm-generic.md: Likewise. + * config/arm/arm1026ejs.md: Likewise. + 2004-02-03 Eric Botcazou * doc/invoke.texi (SPARC options): Remove -mflat and diff --git a/gcc/config.gcc b/gcc/config.gcc index 16d53260e2f..eb9c3ac22a2 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -2399,7 +2399,7 @@ fi ;; arm*-*-*) - supported_defaults="arch cpu float tune" + supported_defaults="arch cpu float tune fpu" for which in cpu tune; do eval "val=\$with_$which" case "$val" in @@ -2426,7 +2426,7 @@ fi case "$with_arch" in "" \ - | armv[2345] | armv2a | armv3m | armv4t | armv5t \ + | armv[23456] | armv2a | armv3m | armv4t | armv5t \ | armv5te | armv6j | ep9312) # OK ;; @@ -2438,7 +2438,7 @@ fi case "$with_float" in "" \ - | soft | hard) + | soft | hard | softfp) # OK ;; *) @@ -2447,6 +2447,17 @@ fi ;; esac + case "$with_fpu" in + "" \ + | fpa | fpe2 | fpe3 | maverick | vfp ) + # OK + ;; + *) + echo "Unknown fpu used in --with-fpu=$fpu" 2>&1 + exit 1 + ;; + esac + if test "x$with_arch" != x && test "x$with_cpu" != x; then echo "Warning: --with-arch overrides --with-cpu" 1>&2 fi @@ -2737,7 +2748,7 @@ fi esac t= - all_defaults="abi cpu arch tune schedule float mode" + all_defaults="abi cpu arch tune schedule float mode fpu" for option in $all_defaults do eval "val=\$with_$option" diff --git a/gcc/config/arm/aof.h b/gcc/config/arm/aof.h index 5a6ab2c0e2c..8f058e26406 100644 --- a/gcc/config/arm/aof.h +++ b/gcc/config/arm/aof.h @@ -246,7 +246,12 @@ do { \ "wr0", "wr1", "wr2", "wr3", \ "wr4", "wr5", "wr6", "wr7", \ "wr8", "wr9", "wr10", "wr11", \ - "wr12", "wr13", "wr14", "wr15" \ + "wr12", "wr13", "wr14", "wr15", \ + "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7", \ + "s8", "s9", "s10", "s11", "s12", "s13", "s14", "s15", \ + "s16", "s17", "s18", "s19", "s20", "s21", "s22", "s23", \ + "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31", \ + "vfpcc" } #define ADDITIONAL_REGISTER_NAMES \ @@ -267,6 +272,22 @@ do { \ {"r13", 13}, {"sp", 13}, \ {"r14", 14}, {"lr", 14}, \ 
{"r15", 15}, {"pc", 15} \ + {"d0", 63}, \ + {"d1", 65}, \ + {"d2", 67}, \ + {"d3", 69}, \ + {"d4", 71}, \ + {"d5", 73}, \ + {"d6", 75}, \ + {"d7", 77}, \ + {"d8", 79}, \ + {"d9", 81}, \ + {"d10", 83}, \ + {"d11", 85}, \ + {"d12", 87}, \ + {"d13", 89}, \ + {"d14", 91}, \ + {"d15", 93}, \ } #define REGISTER_PREFIX "__" diff --git a/gcc/config/arm/aout.h b/gcc/config/arm/aout.h index 1f060fafc7b..e18a1158484 100644 --- a/gcc/config/arm/aout.h +++ b/gcc/config/arm/aout.h @@ -49,7 +49,7 @@ /* The assembler's names for the registers. */ #ifndef REGISTER_NAMES -#define REGISTER_NAMES \ +#define REGISTER_NAMES \ { \ "r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", \ "r8", "r9", "sl", "fp", "ip", "sp", "lr", "pc", \ @@ -63,7 +63,12 @@ "wr0", "wr1", "wr2", "wr3", \ "wr4", "wr5", "wr6", "wr7", \ "wr8", "wr9", "wr10", "wr11", \ - "wr12", "wr13", "wr14", "wr15" \ + "wr12", "wr13", "wr14", "wr15", \ + "s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7", \ + "s8", "s9", "s10", "s11", "s12", "s13", "s14", "s15", \ + "s16", "s17", "s18", "s19", "s20", "s21", "s22", "s23", \ + "s24", "s25", "s26", "s27", "s28", "s29", "s30", "s31", \ + "vfpcc" \ } #endif @@ -152,7 +157,23 @@ {"mvdx12", 39}, \ {"mvdx13", 40}, \ {"mvdx14", 41}, \ - {"mvdx15", 42} \ + {"mvdx15", 42}, \ + {"d0", 63}, \ + {"d1", 65}, \ + {"d2", 67}, \ + {"d3", 69}, \ + {"d4", 71}, \ + {"d5", 73}, \ + {"d6", 75}, \ + {"d7", 77}, \ + {"d8", 79}, \ + {"d9", 81}, \ + {"d10", 83}, \ + {"d11", 85}, \ + {"d12", 87}, \ + {"d13", 89}, \ + {"d14", 91}, \ + {"d15", 93}, \ } #endif diff --git a/gcc/config/arm/arm-cores.def b/gcc/config/arm/arm-cores.def new file mode 100644 index 00000000000..038773c2b53 --- /dev/null +++ b/gcc/config/arm/arm-cores.def @@ -0,0 +1,87 @@ +/* ARM CPU Cores + Copyright (C) 2003 Free Software Foundation, Inc. + Written by CodeSourcery, LLC + + This file is part of GCC. 
+ + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 2, or (at your option) + any later version. + + GCC is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING. If not, write to the Free + Software Foundation, 59 Temple Place - Suite 330, Boston, MA + 02111-1307, USA. */ + +/* Before using #include to read this file, define a macro: + + ARM_CORE(CORE_NAME, FLAGS) + + The CORE_NAME is the name of the core, represented as an identifier + rather than a string constant. The FLAGS are the bitwise-or of the + traits that apply to that core. + + If you update this table, you must update the "tune" attribue in + arm.md. */ + +ARM_CORE(arm2, FL_CO_PROC | FL_MODE26, slowmul) +ARM_CORE(arm250, FL_CO_PROC | FL_MODE26, slowmul) +ARM_CORE(arm3, FL_CO_PROC | FL_MODE26, slowmul) +ARM_CORE(arm6, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm60, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm600, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm610, FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm620, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm7, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +/* arm7m doesn't exist on its own, but only with D, (and I), but + those don't alter the code, so arm7m is sometimes used. 
*/ +ARM_CORE(arm7m, FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_ARCH3M, fastmul) +ARM_CORE(arm7d, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm7dm, FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_ARCH3M, fastmul) +ARM_CORE(arm7di, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm7dmi, FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_ARCH3M, fastmul) +ARM_CORE(arm70, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm700, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm700i, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm710, FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm720, FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm710c, FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm7100, FL_MODE26 | FL_MODE32, slowmul) +ARM_CORE(arm7500, FL_MODE26 | FL_MODE32, slowmul) +/* Doesn't have an external co-proc, but does have embedded fpa. */ +ARM_CORE(arm7500fe, FL_CO_PROC | FL_MODE26 | FL_MODE32, slowmul) +/* V4 Architecture Processors */ +ARM_CORE(arm7tdmi, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB, fastmul) +ARM_CORE(arm710t, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB, fastmul) +ARM_CORE(arm720t, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB, fastmul) +ARM_CORE(arm740t, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB, fastmul) +ARM_CORE(arm8, FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED, fastmul) +ARM_CORE(arm810, FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED, fastmul) +ARM_CORE(arm9, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED, fastmul) +ARM_CORE(arm920, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED, fastmul) +ARM_CORE(arm920t, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED, fastmul) +ARM_CORE(arm940t, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED, fastmul) +ARM_CORE(arm9tdmi, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED, fastmul) +ARM_CORE(arm9e, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED, 9e) + +ARM_CORE(ep9312, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED | 
FL_CIRRUS, fastmul) +ARM_CORE(strongarm, FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED | FL_STRONG, fastmul) +ARM_CORE(strongarm110, FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED | FL_STRONG, fastmul) +ARM_CORE(strongarm1100, FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED | FL_STRONG, fastmul) +ARM_CORE(strongarm1110, FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED | FL_STRONG, fastmul) +/* V5 Architecture Processors */ +ARM_CORE(arm10tdmi, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_ARCH5, fastmul) +ARM_CORE(arm1020t, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_ARCH5, fastmul) +ARM_CORE(arm926ejs, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E, 9e) +ARM_CORE(arm1026ejs, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E, 9e) +ARM_CORE(xscale, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_STRONG | FL_ARCH5 | FL_ARCH5E | FL_XSCALE, xscale) +ARM_CORE(iwmmxt, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_STRONG | FL_ARCH5 | FL_ARCH5E | FL_XSCALE | FL_IWMMXT, xscale) +/* V6 Architecture Processors */ +ARM_CORE(arm1136js, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6, 9e) +ARM_CORE(arm1136jfs, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6 | FL_VFPV2, 9e) diff --git a/gcc/config/arm/arm-generic.md b/gcc/config/arm/arm-generic.md new file mode 100644 index 00000000000..ec2df47b465 --- /dev/null +++ b/gcc/config/arm/arm-generic.md @@ -0,0 +1,152 @@ +;; Generic ARM Pipeline Description +;; Copyright (C) 2003 Free Software Foundation, Inc. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. 
+;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING. If not, write to the Free +;; Software Foundation, 59 Temple Place - Suite 330, Boston, MA +;; 02111-1307, USA. */ + +(define_automaton "arm") + +;; Write buffer +; +; Strictly, we should model a 4-deep write buffer for ARM7xx based chips +; +; The write buffer on some of the arm6 processors is hard to model exactly. +; There is room in the buffer for up to two addresses and up to eight words +; of memory, but the two needn't be split evenly. When writing the two +; addresses are fully pipelined. However, a read from memory that is not +; currently in the cache will block until the writes have completed. +; It is normally the case that FCLK and MCLK will be in the ratio 2:1, so +; writes will take 2 FCLK cycles per word, if FCLK and MCLK are asynchronous +; (they aren't allowed to be at present) then there is a startup cost of 1MCLK +; cycle to add as well. +(define_cpu_unit "write_buf" "arm") + +;; Write blockage unit +; +; The write_blockage unit models (partially), the fact that reads will stall +; until the write buffer empties. 
+; The f_mem_r and r_mem_f could also block, but they are to the stack, +; so we don't model them here +(define_cpu_unit "write_blockage" "arm") + +;; Core +; +(define_cpu_unit "core" "arm") + +(define_insn_reservation "r_mem_f_wbuf" 5 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "yes") + (eq_attr "type" "r_mem_f"))) + "core+write_buf*3") + +(define_insn_reservation "store_wbuf" 5 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "yes") + (eq_attr "type" "store1"))) + "core+write_buf*3+write_blockage*5") + +(define_insn_reservation "store2_wbuf" 7 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "yes") + (eq_attr "type" "store2"))) + "core+write_buf*4+write_blockage*7") + +(define_insn_reservation "store3_wbuf" 9 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "yes") + (eq_attr "type" "store3"))) + "core+write_buf*5+write_blockage*9") + +(define_insn_reservation "store4_wbuf" 11 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "yes") + (eq_attr "type" "store4"))) + "core+write_buf*6+write_blockage*11") + +(define_insn_reservation "store2" 3 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "no") + (eq_attr "type" "store2"))) + "core*3") + +(define_insn_reservation "store3" 4 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "no") + (eq_attr "type" "store3"))) + "core*4") + +(define_insn_reservation "store4" 5 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "model_wbuf" "no") + (eq_attr "type" "store4"))) + "core*5") + +(define_insn_reservation "store_ldsched" 1 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "yes") + (eq_attr "type" "store1"))) + "core") + +(define_insn_reservation "load_ldsched_xscale" 3 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "yes") + (and (eq_attr "type" "load_byte,load1") + (eq_attr "is_xscale" "yes")))) + "core") + +(define_insn_reservation "load_ldsched" 2 + 
(and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "yes") + (and (eq_attr "type" "load_byte,load1") + (eq_attr "is_xscale" "no")))) + "core") + +(define_insn_reservation "load_or_store" 2 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "!yes") + (eq_attr "type" "load_byte,load1,load2,load3,load4,store1"))) + "core*2") + +(define_insn_reservation "mult" 16 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "no") (eq_attr "type" "mult"))) + "core*16") + +(define_insn_reservation "mult_ldsched_strongarm" 3 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "yes") + (and (eq_attr "is_strongarm" "yes") + (eq_attr "type" "mult")))) + "core*2") + +(define_insn_reservation "mult_ldsched" 4 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "ldsched" "yes") + (and (eq_attr "is_strongarm" "no") + (eq_attr "type" "mult")))) + "core*4") + +(define_insn_reservation "multi_cycle" 32 + (and (eq_attr "generic_sched" "yes") + (and (eq_attr "core_cycles" "multi") + (eq_attr "type" "!mult,load_byte,load1,load2,load3,load4,store1,store2,store3,store4"))) + "core*32") + +(define_insn_reservation "single_cycle" 1 + (and (eq_attr "generic_sched" "yes") + (eq_attr "core_cycles" "single")) + "core") diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index 471254efe4e..25f563109ab 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -1,5 +1,6 @@ /* Prototypes for exported functions defined in arm.c and pe.c - Copyright (C) 1999, 2000, 2001, 2002, 2003 Free Software Foundation, Inc. + Copyright (C) 1999, 2000, 2001, 2002, 2003, 2004 + Free Software Foundation, Inc. 
Contributed by Richard Earnshaw (rearnsha@arm.com) Minor hacks by Nick Clifton (nickc@cygnus.com) @@ -53,12 +54,14 @@ extern int arm_legitimate_address_p (enum machine_mode, rtx, int); extern int thumb_legitimate_address_p (enum machine_mode, rtx, int); extern int thumb_legitimate_offset_p (enum machine_mode, HOST_WIDE_INT); extern rtx arm_legitimize_address (rtx, rtx, enum machine_mode); -extern int const_double_rtx_ok_for_fpa (rtx); +extern int arm_const_double_rtx (rtx); extern int neg_const_double_rtx_ok_for_fpa (rtx); +extern enum reg_class vfp_secondary_reload_class (enum machine_mode, rtx); /* Predicates. */ extern int s_register_operand (rtx, enum machine_mode); extern int arm_hard_register_operand (rtx, enum machine_mode); +extern int arm_general_register_operand (rtx, enum machine_mode); extern int f_register_operand (rtx, enum machine_mode); extern int reg_or_int_operand (rtx, enum machine_mode); extern int arm_reload_memory_operand (rtx, enum machine_mode); @@ -70,8 +73,8 @@ extern int arm_not_operand (rtx, enum machine_mode); extern int offsettable_memory_operand (rtx, enum machine_mode); extern int alignable_memory_operand (rtx, enum machine_mode); extern int bad_signed_byte_operand (rtx, enum machine_mode); -extern int fpa_rhs_operand (rtx, enum machine_mode); -extern int fpa_add_operand (rtx, enum machine_mode); +extern int arm_float_rhs_operand (rtx, enum machine_mode); +extern int arm_float_add_operand (rtx, enum machine_mode); extern int power_of_two_operand (rtx, enum machine_mode); extern int nonimmediate_di_operand (rtx, enum machine_mode); extern int di_operand (rtx, enum machine_mode); @@ -95,6 +98,13 @@ extern int cirrus_general_operand (rtx, enum machine_mode); extern int cirrus_register_operand (rtx, enum machine_mode); extern int cirrus_shift_const (rtx, enum machine_mode); extern int cirrus_memory_offset (rtx); +extern int vfp_mem_operand (rtx); +extern int vfp_compare_operand (rtx, enum machine_mode); +extern int 
arm_float_compare_operand (rtx, enum machine_mode); +extern int arm_no_early_store_addr_dep (rtx, rtx); +extern int arm_no_early_alu_shift_dep (rtx, rtx); +extern int arm_no_early_alu_shift_value_dep (rtx, rtx); +extern int arm_no_early_mul_dep (rtx, rtx); extern int symbol_mentioned_p (rtx); extern int label_mentioned_p (rtx); @@ -138,6 +148,7 @@ extern int arm_debugger_arg_offset (int, rtx); extern int arm_is_longcall_p (rtx, int, int); extern int arm_emit_vector_const (FILE *, rtx); extern const char * arm_output_load_gr (rtx *); +extern const char *vfp_output_fstmx (rtx *); #if defined TREE_CODE extern rtx arm_function_arg (CUMULATIVE_ARGS *, enum machine_mode, tree, int); diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 0be10946051..650c74ddb76 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -75,7 +75,6 @@ static bool arm_assemble_integer (rtx, unsigned int, int); #endif static const char *fp_const_from_val (REAL_VALUE_TYPE *); static arm_cc get_arm_condition_code (rtx); -static void init_fpa_table (void); static HOST_WIDE_INT int_log2 (HOST_WIDE_INT); static rtx is_jump_table (rtx); static const char *output_multi_immediate (rtx *, const char *, const char *, @@ -124,7 +123,10 @@ static void arm_internal_label (FILE *, const char *, unsigned long); static void arm_output_mi_thunk (FILE *, tree, HOST_WIDE_INT, HOST_WIDE_INT, tree); static int arm_rtx_costs_1 (rtx, enum rtx_code, enum rtx_code); -static bool arm_rtx_costs (rtx, int, int, int *); +static bool arm_slowmul_rtx_costs (rtx, int, int, int *); +static bool arm_fastmul_rtx_costs (rtx, int, int, int *); +static bool arm_xscale_rtx_costs (rtx, int, int, int *); +static bool arm_9e_rtx_costs (rtx, int, int, int *); static int arm_address_cost (rtx); static bool arm_memory_load_p (rtx); static bool arm_cirrus_insn_p (rtx); @@ -223,8 +225,9 @@ static void arm_setup_incoming_varargs (CUMULATIVE_ARGS *, enum machine_mode, #undef TARGET_ASM_CAN_OUTPUT_MI_THUNK #define 
TARGET_ASM_CAN_OUTPUT_MI_THUNK default_can_output_mi_thunk_no_vcall +/* This will be overridden in arm_override_options. */ #undef TARGET_RTX_COSTS -#define TARGET_RTX_COSTS arm_rtx_costs +#define TARGET_RTX_COSTS arm_slowmul_rtx_costs #undef TARGET_ADDRESS_COST #define TARGET_ADDRESS_COST arm_address_cost @@ -266,17 +269,32 @@ int making_const_table; stored from the compare operation. */ rtx arm_compare_op0, arm_compare_op1; -/* What type of floating point are we tuning for? */ -enum fputype arm_fpu_tune; +/* The processor for which instructions should be scheduled. */ +enum processor_type arm_tune = arm_none; + +/* Which floating point model to use. */ +enum arm_fp_model arm_fp_model; -/* What type of floating point instructions are available? */ +/* Which floating point hardware is available. */ enum fputype arm_fpu_arch; +/* Which floating point hardware to schedule for. */ +enum fputype arm_fpu_tune; + +/* Whether to use floating point hardware. */ +enum float_abi_type arm_float_abi; + /* What program mode is the cpu running in? 26-bit mode or 32-bit mode. */ enum prog_mode_type arm_prgmode; -/* Set by the -mfp=... option. */ -const char * target_fp_name = NULL; +/* Set by the -mfpu=... option. */ +const char * target_fpu_name = NULL; + +/* Set by the -mfpe=... option. */ +const char * target_fpe_name = NULL; + +/* Set by the -mfloat-abi=... option. */ +const char * target_float_abi_name = NULL; /* Used to parse -mstructure_size_boundary command line option. */ const char * structure_size_string = NULL; @@ -284,7 +302,7 @@ int arm_structure_size_boundary = DEFAULT_STRUCTURE_SIZE_BOUNDARY; /* Bit values used to identify processor capabilities. 
*/ #define FL_CO_PROC (1 << 0) /* Has external co-processor bus */ -#define FL_FAST_MULT (1 << 1) /* Fast multiply */ +#define FL_ARCH3M (1 << 1) /* Extended multiply */ #define FL_MODE26 (1 << 2) /* 26-bit mode support */ #define FL_MODE32 (1 << 3) /* 32-bit mode support */ #define FL_ARCH4 (1 << 4) /* Architecture rel 4 */ @@ -295,26 +313,25 @@ int arm_structure_size_boundary = DEFAULT_STRUCTURE_SIZE_BOUNDARY; #define FL_ARCH5E (1 << 9) /* DSP extensions to v5 */ #define FL_XSCALE (1 << 10) /* XScale */ #define FL_CIRRUS (1 << 11) /* Cirrus/DSP. */ -#define FL_IWMMXT (1 << 29) /* XScale v2 or "Intel Wireless MMX technology". */ -#define FL_ARCH6J (1 << 12) /* Architecture rel 6. Adds +#define FL_ARCH6 (1 << 12) /* Architecture rel 6. Adds media instructions. */ #define FL_VFPV2 (1 << 13) /* Vector Floating Point V2. */ +#define FL_IWMMXT (1 << 29) /* XScale v2 or "Intel Wireless MMX technology". */ + /* The bits in this mask specify which instructions we are allowed to generate. */ static unsigned long insn_flags = 0; /* The bits in this mask specify which instruction scheduling options should - be used. Note - there is an overlap with the FL_FAST_MULT. For some - hardware we want to be able to generate the multiply instructions, but to - tune as if they were not present in the architecture. */ + be used. */ static unsigned long tune_flags = 0; /* The following are used in the arm.md file as equivalents to bits in the above two flag variables. */ -/* Nonzero if this is an "M" variant of the processor. */ -int arm_fast_multiply = 0; +/* Nonzero if this chip supports the ARM Architecture 3M extensions. */ +int arm_arch3m = 0; /* Nonzero if this chip supports the ARM Architecture 4 extensions. */ int arm_arch4 = 0; @@ -325,6 +342,9 @@ int arm_arch5 = 0; /* Nonzero if this chip supports the ARM Architecture 5E extensions. */ int arm_arch5e = 0; +/* Nonzero if this chip supports the ARM Architecture 6 extensions. 
*/ +int arm_arch6 = 0; + /* Nonzero if this chip can benefit from load scheduling. */ int arm_ld_sched = 0; @@ -343,9 +363,6 @@ int arm_tune_xscale = 0; /* Nonzero if this chip is an ARM6 or an ARM7. */ int arm_is_6_or_7 = 0; -/* Nonzero if this chip is a Cirrus/DSP. */ -int arm_is_cirrus = 0; - /* Nonzero if generating Thumb instructions. */ int thumb_code = 0; @@ -389,7 +406,9 @@ static const char * const arm_condition_codes[] = struct processors { const char *const name; + enum processor_type core; const unsigned long flags; + bool (* rtx_costs) (rtx, int, int, int *); }; /* Not all of these give usefully different compilation alternatives, @@ -397,83 +416,35 @@ struct processors static const struct processors all_cores[] = { /* ARM Cores */ - - {"arm2", FL_CO_PROC | FL_MODE26 }, - {"arm250", FL_CO_PROC | FL_MODE26 }, - {"arm3", FL_CO_PROC | FL_MODE26 }, - {"arm6", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm60", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm600", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm610", FL_MODE26 | FL_MODE32 }, - {"arm620", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm7", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - /* arm7m doesn't exist on its own, but only with D, (and I), but - those don't alter the code, so arm7m is sometimes used. */ - {"arm7m", FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_FAST_MULT }, - {"arm7d", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm7dm", FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_FAST_MULT }, - {"arm7di", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm7dmi", FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_FAST_MULT }, - {"arm70", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm700", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm700i", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - {"arm710", FL_MODE26 | FL_MODE32 }, - {"arm720", FL_MODE26 | FL_MODE32 }, - {"arm710c", FL_MODE26 | FL_MODE32 }, - {"arm7100", FL_MODE26 | FL_MODE32 }, - {"arm7500", FL_MODE26 | FL_MODE32 }, - /* Doesn't have an external co-proc, but does have embedded fpa. 
*/ - {"arm7500fe", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - /* V4 Architecture Processors */ - {"arm7tdmi", FL_CO_PROC | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB }, - {"arm710t", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB }, - {"arm720t", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB }, - {"arm740t", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB }, - {"arm8", FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED }, - {"arm810", FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED }, - {"arm9", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED }, - {"arm920", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED }, - {"arm920t", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED }, - {"arm940t", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED }, - {"arm9tdmi", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED }, - {"arm9e", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED }, - {"ep9312", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED | FL_CIRRUS }, - {"strongarm", FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED | FL_STRONG }, - {"strongarm110", FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED | FL_STRONG }, - {"strongarm1100", FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED | FL_STRONG }, - {"strongarm1110", FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED | FL_STRONG }, - /* V5 Architecture Processors */ - {"arm10tdmi", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_ARCH5 }, - {"arm1020t", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_ARCH5 }, - {"arm926ejs", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E }, - {"arm1026ejs", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E }, - {"xscale", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_STRONG | FL_ARCH5 | FL_ARCH5E | FL_XSCALE }, - {"iwmmxt", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED | 
FL_STRONG | FL_ARCH5 | FL_ARCH5E | FL_XSCALE | FL_IWMMXT }, - /* V6 Architecture Processors */ - {"arm1136js", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6J }, - {"arm1136jfs", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6J | FL_VFPV2 }, - {NULL, 0} +#define ARM_CORE(NAME, FLAGS, COSTS) \ + {#NAME, arm_none, FLAGS, arm_##COSTS##_rtx_costs}, +#include "arm-cores.def" +#undef ARM_CORE + {NULL, arm_none, 0, NULL} }; static const struct processors all_architectures[] = { /* ARM Architectures */ + /* We don't specify rtx_costs here as it will be figured out + from the core. */ - { "armv2", FL_CO_PROC | FL_MODE26 }, - { "armv2a", FL_CO_PROC | FL_MODE26 }, - { "armv3", FL_CO_PROC | FL_MODE26 | FL_MODE32 }, - { "armv3m", FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_FAST_MULT }, - { "armv4", FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 }, + { "armv2", arm2, FL_CO_PROC | FL_MODE26 , NULL}, + { "armv2a", arm2, FL_CO_PROC | FL_MODE26 , NULL}, + { "armv3", arm6, FL_CO_PROC | FL_MODE26 | FL_MODE32 , NULL}, + { "armv3m", arm7m, FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_ARCH3M , NULL}, + { "armv4", arm7tdmi, FL_CO_PROC | FL_MODE26 | FL_MODE32 | FL_ARCH3M | FL_ARCH4 , NULL}, /* Strictly, FL_MODE26 is a permitted option for v4t, but there are no implementations that support it, so we will leave it out for now. 
*/ - { "armv4t", FL_CO_PROC | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB }, - { "armv5", FL_CO_PROC | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 }, - { "armv5t", FL_CO_PROC | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 }, - { "armv5te", FL_CO_PROC | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E }, - { "armv6j", FL_CO_PROC | FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6J }, - { "ep9312", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_LDSCHED | FL_CIRRUS }, - {"iwmmxt", FL_MODE32 | FL_FAST_MULT | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_STRONG | FL_ARCH5 | FL_ARCH5E | FL_XSCALE | FL_IWMMXT }, - { NULL, 0 } + { "armv4t", arm7tdmi, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB , NULL}, + { "armv5", arm10tdmi, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 , NULL}, + { "armv5t", arm10tdmi, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 , NULL}, + { "armv5te", arm1026ejs, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E , NULL}, + { "armv6", arm1136js, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6 , NULL}, + { "armv6j", arm1136js, FL_CO_PROC | FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_ARCH5 | FL_ARCH5E | FL_ARCH6 , NULL}, + { "ep9312", ep9312, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_LDSCHED | FL_CIRRUS , NULL}, + {"iwmmxt", iwmmxt, FL_MODE32 | FL_ARCH3M | FL_ARCH4 | FL_THUMB | FL_LDSCHED | FL_STRONG | FL_ARCH5 | FL_ARCH5E | FL_XSCALE | FL_IWMMXT , NULL}, + { NULL, arm_none, 0 , NULL} }; /* This is a magic structure. The 'string' field is magically filled in @@ -488,6 +459,57 @@ struct arm_cpu_select arm_select[] = { NULL, "-mtune=", all_cores } }; +struct fpu_desc +{ + const char * name; + enum fputype fpu; +}; + + +/* Available values for for -mfpu=. 
*/ + +static const struct fpu_desc all_fpus[] = +{ + {"fpa", FPUTYPE_FPA}, + {"fpe2", FPUTYPE_FPA_EMU2}, + {"fpe3", FPUTYPE_FPA_EMU2}, + {"maverick", FPUTYPE_MAVERICK}, + {"vfp", FPUTYPE_VFP} +}; + + +/* Floating point models used by the different hardware. + See fputype in arm.h. */ + +static const enum fputype fp_model_for_fpu[] = +{ + /* No FP hardware. */ + ARM_FP_MODEL_UNKNOWN, /* FPUTYPE_NONE */ + ARM_FP_MODEL_FPA, /* FPUTYPE_FPA */ + ARM_FP_MODEL_FPA, /* FPUTYPE_FPA_EMU2 */ + ARM_FP_MODEL_FPA, /* FPUTYPE_FPA_EMU3 */ + ARM_FP_MODEL_MAVERICK, /* FPUTYPE_MAVERICK */ + ARM_FP_MODEL_VFP /* FPUTYPE_VFP */ +}; + + +struct float_abi +{ + const char * name; + enum float_abi_type abi_type; +}; + + +/* Available values for -mfloat-abi=. */ + +static const struct float_abi all_float_abis[] = +{ + {"soft", ARM_FLOAT_ABI_SOFT}, + {"softfp", ARM_FLOAT_ABI_SOFTFP}, + {"hard", ARM_FLOAT_ABI_HARD} +}; + + /* Return the number of bits set in VALUE. */ static unsigned bit_count (unsigned long value) @@ -509,7 +531,7 @@ void arm_override_options (void) { unsigned i; - + /* Set up the flags based on the cpu/architecture selected by the user. */ for (i = ARRAY_SIZE (arm_select); i--;) { @@ -522,9 +544,19 @@ arm_override_options (void) for (sel = ptr->processors; sel->name != NULL; sel++) if (streq (ptr->string, sel->name)) { - if (i == 2) - tune_flags = sel->flags; - else + /* Determine the processor core for which we should + tune code-generation. */ + if (/* -mcpu= is a sensible default. */ + i == 0 + /* If -march= is used, and -mcpu= has not been used, + assume that we should tune for a representative + CPU from that architecture. */ + || i == 1 + /* -mtune= overrides -mcpu= and -march=. */ + || i == 2) + arm_tune = (enum processor_type) (sel - ptr->processors); + + if (i != 2) { /* If we have been given an architecture and a processor make sure that they are compatible. 
We only generate @@ -571,10 +603,10 @@ arm_override_options (void) { TARGET_CPU_xscale, "xscale" }, { TARGET_CPU_ep9312, "ep9312" }, { TARGET_CPU_iwmmxt, "iwmmxt" }, - { TARGET_CPU_arm926ej_s, "arm926ej-s" }, - { TARGET_CPU_arm1026ej_s, "arm1026ej-s" }, - { TARGET_CPU_arm1136j_s, "arm1136j_s" }, - { TARGET_CPU_arm1136jf_s, "arm1136jf_s" }, + { TARGET_CPU_arm926ejs, "arm926ejs" }, + { TARGET_CPU_arm1026ejs, "arm1026ejs" }, + { TARGET_CPU_arm1136js, "arm1136js" }, + { TARGET_CPU_arm1136jfs, "arm1136jfs" }, { TARGET_CPU_generic, "arm" }, { 0, 0 } }; @@ -598,7 +630,7 @@ arm_override_options (void) abort (); insn_flags = sel->flags; - + /* Now check to see if the user has specified some command line switch that require certain abilities from the cpu. */ sought = 0; @@ -668,12 +700,17 @@ arm_override_options (void) insn_flags = sel->flags; } + if (arm_tune == arm_none) + arm_tune = (enum processor_type) (sel - all_cores); } - /* If tuning has not been specified, tune for whichever processor or - architecture has been selected. */ - if (tune_flags == 0) - tune_flags = insn_flags; + /* The processor for which we should tune should now have been + chosen. */ + if (arm_tune == arm_none) + abort (); + + tune_flags = all_cores[(int)arm_tune].flags; + targetm.rtx_costs = all_cores[(int)arm_tune].rtx_costs; /* Make sure that the processor choice does not conflict with any of the other command line choices. */ @@ -762,66 +799,110 @@ arm_override_options (void) warning ("passing floating point arguments in fp regs not yet supported"); /* Initialize boolean versions of the flags, for use in the arm.md file. 
*/ - arm_fast_multiply = (insn_flags & FL_FAST_MULT) != 0; - arm_arch4 = (insn_flags & FL_ARCH4) != 0; - arm_arch5 = (insn_flags & FL_ARCH5) != 0; - arm_arch5e = (insn_flags & FL_ARCH5E) != 0; - arm_arch_xscale = (insn_flags & FL_XSCALE) != 0; - - arm_ld_sched = (tune_flags & FL_LDSCHED) != 0; - arm_is_strong = (tune_flags & FL_STRONG) != 0; - thumb_code = (TARGET_ARM == 0); - arm_is_6_or_7 = (((tune_flags & (FL_MODE26 | FL_MODE32)) - && !(tune_flags & FL_ARCH4))) != 0; - arm_tune_xscale = (tune_flags & FL_XSCALE) != 0; - arm_is_cirrus = (tune_flags & FL_CIRRUS) != 0; - arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0; + arm_arch3m = (insn_flags & FL_ARCH3M) != 0; + arm_arch4 = (insn_flags & FL_ARCH4) != 0; + arm_arch5 = (insn_flags & FL_ARCH5) != 0; + arm_arch5e = (insn_flags & FL_ARCH5E) != 0; + arm_arch6 = (insn_flags & FL_ARCH6) != 0; + arm_arch_xscale = (insn_flags & FL_XSCALE) != 0; + + arm_ld_sched = (tune_flags & FL_LDSCHED) != 0; + arm_is_strong = (tune_flags & FL_STRONG) != 0; + thumb_code = (TARGET_ARM == 0); + arm_is_6_or_7 = (((tune_flags & (FL_MODE26 | FL_MODE32)) + && !(tune_flags & FL_ARCH4))) != 0; + arm_tune_xscale = (tune_flags & FL_XSCALE) != 0; + arm_arch_iwmmxt = (insn_flags & FL_IWMMXT) != 0; if (TARGET_IWMMXT && (! TARGET_ATPCS)) target_flags |= ARM_FLAG_ATPCS; - if (arm_is_cirrus) + arm_fp_model = ARM_FP_MODEL_UNKNOWN; + if (target_fpu_name == NULL && target_fpe_name != NULL) { - arm_fpu_tune = FPUTYPE_MAVERICK; - - /* Ignore -mhard-float if -mcpu=ep9312. */ - if (TARGET_HARD_FLOAT) - target_flags ^= ARM_FLAG_SOFT_FLOAT; + if (streq (target_fpe_name, "2")) + target_fpu_name = "fpe2"; + else if (streq (target_fpe_name, "3")) + target_fpu_name = "fpe3"; + else + error ("invalid floating point emulation option: -mfpe=%s", + target_fpe_name); + } + if (target_fpu_name != NULL) + { + /* The user specified a FPU. 
*/ + for (i = 0; i < ARRAY_SIZE (all_fpus); i++) + { + if (streq (all_fpus[i].name, target_fpu_name)) + { + arm_fpu_arch = all_fpus[i].fpu; + arm_fpu_tune = arm_fpu_arch; + arm_fp_model = fp_model_for_fpu[arm_fpu_arch]; + break; + } + } + if (arm_fp_model == ARM_FP_MODEL_UNKNOWN) + error ("invalid floating point option: -mfpu=%s", target_fpu_name); } else - /* Default value for floating point code... if no co-processor - bus, then schedule for emulated floating point. Otherwise, - assume the user has an FPA. - Note: this does not prevent use of floating point instructions, - -msoft-float does that. */ - arm_fpu_tune = (tune_flags & FL_CO_PROC) ? FPUTYPE_FPA : FPUTYPE_FPA_EMU3; - - if (target_fp_name) { - if (streq (target_fp_name, "2")) +#ifdef FPUTYPE_DEFAULT + /* Use the default is it is specified for this platform. */ + arm_fpu_arch = FPUTYPE_DEFAULT; + arm_fpu_tune = FPUTYPE_DEFAULT; +#else + /* Pick one based on CPU type. */ + if ((insn_flags & FL_VFP) != 0) + arm_fpu_arch = FPUTYPE_VFP; + else if (insn_flags & FL_CIRRUS) + arm_fpu_arch = FPUTYPE_MAVERICK; + else arm_fpu_arch = FPUTYPE_FPA_EMU2; - else if (streq (target_fp_name, "3")) - arm_fpu_arch = FPUTYPE_FPA_EMU3; +#endif + if (tune_flags & FL_CO_PROC && arm_fpu_arch == FPUTYPE_FPA_EMU2) + arm_fpu_tune = FPUTYPE_FPA; else - error ("invalid floating point emulation option: -mfpe-%s", - target_fp_name); + arm_fpu_tune = arm_fpu_arch; + arm_fp_model = fp_model_for_fpu[arm_fpu_arch]; + if (arm_fp_model == ARM_FP_MODEL_UNKNOWN) + abort (); + } + + if (target_float_abi_name != NULL) + { + /* The user specified a FP ABI. 
*/ + for (i = 0; i < ARRAY_SIZE (all_float_abis); i++) + { + if (streq (all_float_abis[i].name, target_float_abi_name)) + { + arm_float_abi = all_float_abis[i].abi_type; + break; + } + } + if (i == ARRAY_SIZE (all_float_abis)) + error ("invalid floating point abi: -mfloat-abi=%s", + target_float_abi_name); } else - arm_fpu_arch = FPUTYPE_DEFAULT; - - if (TARGET_FPE) { - if (arm_fpu_tune == FPUTYPE_FPA_EMU3) - arm_fpu_tune = FPUTYPE_FPA_EMU2; - else if (arm_fpu_tune == FPUTYPE_MAVERICK) - warning ("-mfpe switch not supported by ep9312 target cpu - ignored."); - else if (arm_fpu_tune != FPUTYPE_FPA) - arm_fpu_tune = FPUTYPE_FPA_EMU2; + /* Use soft-float target flag. */ + if (target_flags & ARM_FLAG_SOFT_FLOAT) + arm_float_abi = ARM_FLOAT_ABI_SOFT; + else + arm_float_abi = ARM_FLOAT_ABI_HARD; } + + if (arm_float_abi == ARM_FLOAT_ABI_SOFTFP) + sorry ("-mfloat-abi=softfp"); + /* If soft-float is specified then don't use FPU. */ + if (TARGET_SOFT_FLOAT) + arm_fpu_arch = FPUTYPE_NONE; /* For arm2/3 there is no need to do any scheduling if there is only a floating point emulator, or we are doing software floating-point. */ - if ((TARGET_SOFT_FLOAT || arm_fpu_tune != FPUTYPE_FPA) + if ((TARGET_SOFT_FLOAT + || arm_fpu_tune == FPUTYPE_FPA_EMU2 + || arm_fpu_tune == FPUTYPE_FPA_EMU3) && (tune_flags & FL_MODE32) == 0) flag_schedule_insns = flag_schedule_insns_after_reload = 0; @@ -1123,8 +1204,14 @@ use_return_insn (int iscond, rtx sibling) /* Can't be done if any of the FPA regs are pushed, since this also requires an insn. */ - if (TARGET_HARD_FLOAT) - for (regno = FIRST_ARM_FP_REGNUM; regno <= LAST_ARM_FP_REGNUM; regno++) + if (TARGET_HARD_FLOAT && TARGET_FPA) + for (regno = FIRST_FPA_REGNUM; regno <= LAST_FPA_REGNUM; regno++) + if (regs_ever_live[regno] && !call_used_regs[regno]) + return 0; + + /* Likewise VFP regs. 
*/ + if (TARGET_HARD_FLOAT && TARGET_VFP) + for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++) if (regs_ever_live[regno] && !call_used_regs[regno]) return 0; @@ -2028,15 +2115,14 @@ arm_return_in_memory (tree type) int arm_float_words_big_endian (void) { - if (TARGET_CIRRUS) + if (TARGET_MAVERICK) return 0; /* For FPA, float words are always big-endian. For VFP, floats words follow the memory system mode. */ - if (TARGET_HARD_FLOAT) + if (TARGET_FPA) { - /* FIXME: TARGET_HARD_FLOAT currently implies FPA. */ return 1; } @@ -2779,6 +2865,24 @@ arm_legitimate_address_p (enum machine_mode mode, rtx x, int strict_p) } } + else if (TARGET_HARD_FLOAT && TARGET_VFP && mode == DFmode) + { + if (GET_CODE (x) == PLUS + && arm_address_register_rtx_p (XEXP (x, 0), strict_p) + && GET_CODE (XEXP (x, 1)) == CONST_INT) + { + HOST_WIDE_INT val = INTVAL (XEXP (x, 1)); + + /* ??? valid arm offsets are a subset of VFP offsets. + For now only allow this subset. Proper fix is to add an + additional memory constraint for arm address modes. + Alternatively allow full vfp addressing and let + output_move_double fix it up with a sub-optimal sequence. 
*/ + if (val == 4 || val == -4 || val == -8) + return 1; + } + } + else if (GET_CODE (x) == PLUS) { rtx xop0 = XEXP (x, 0); @@ -2825,12 +2929,12 @@ arm_legitimate_index_p (enum machine_mode mode, rtx index, int strict_p) HOST_WIDE_INT range; enum rtx_code code = GET_CODE (index); - if (TARGET_HARD_FLOAT && GET_MODE_CLASS (mode) == MODE_FLOAT) + if (TARGET_HARD_FLOAT && TARGET_FPA && GET_MODE_CLASS (mode) == MODE_FLOAT) return (code == CONST_INT && INTVAL (index) < 1024 && INTVAL (index) > -1024 && (INTVAL (index) & 3) == 0); - if (TARGET_CIRRUS + if (TARGET_HARD_FLOAT && TARGET_MAVERICK && (GET_MODE_CLASS (mode) == MODE_FLOAT || mode == DImode)) return (code == CONST_INT && INTVAL (index) < 255 @@ -3065,7 +3169,10 @@ arm_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode) rtx base_reg, val; n = INTVAL (xop1); - if (mode == DImode || (TARGET_SOFT_FLOAT && mode == DFmode)) + /* VFP addressing modes actually allow greater offsets, but for + now we just stick with the lowest common denominator. */ + if (mode == DImode + || ((TARGET_SOFT_FLOAT || TARGET_VFP) && mode == DFmode)) { low_n = n & 0x0f; n &= ~0x0f; @@ -3135,129 +3242,133 @@ arm_legitimize_address (rtx x, rtx orig_x, enum machine_mode mode) #ifndef COSTS_N_INSNS #define COSTS_N_INSNS(N) ((N) * 4 - 2) #endif -/* Worker routine for arm_rtx_costs. 
*/ static inline int -arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) +thumb_rtx_costs (rtx x, enum rtx_code code, enum rtx_code outer) { enum machine_mode mode = GET_MODE (x); - enum rtx_code subcode; - int extra_cost; - if (TARGET_THUMB) + switch (code) { - switch (code) - { - case ASHIFT: - case ASHIFTRT: - case LSHIFTRT: - case ROTATERT: - case PLUS: - case MINUS: - case COMPARE: - case NEG: - case NOT: - return COSTS_N_INSNS (1); - - case MULT: - if (GET_CODE (XEXP (x, 1)) == CONST_INT) - { - int cycles = 0; - unsigned HOST_WIDE_INT i = INTVAL (XEXP (x, 1)); - - while (i) - { - i >>= 2; - cycles++; - } - return COSTS_N_INSNS (2) + cycles; - } - return COSTS_N_INSNS (1) + 16; - - case SET: - return (COSTS_N_INSNS (1) - + 4 * ((GET_CODE (SET_SRC (x)) == MEM) - + GET_CODE (SET_DEST (x)) == MEM)); + case ASHIFT: + case ASHIFTRT: + case LSHIFTRT: + case ROTATERT: + case PLUS: + case MINUS: + case COMPARE: + case NEG: + case NOT: + return COSTS_N_INSNS (1); + + case MULT: + if (GET_CODE (XEXP (x, 1)) == CONST_INT) + { + int cycles = 0; + unsigned HOST_WIDE_INT i = INTVAL (XEXP (x, 1)); - case CONST_INT: - if (outer == SET) + while (i) { - if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256) - return 0; - if (thumb_shiftable_const (INTVAL (x))) - return COSTS_N_INSNS (2); - return COSTS_N_INSNS (3); - } - else if ((outer == PLUS || outer == COMPARE) - && INTVAL (x) < 256 && INTVAL (x) > -256) - return 0; - else if (outer == AND - && INTVAL (x) < 256 && INTVAL (x) >= -256) - return COSTS_N_INSNS (1); - else if (outer == ASHIFT || outer == ASHIFTRT - || outer == LSHIFTRT) - return 0; - return COSTS_N_INSNS (2); - - case CONST: - case CONST_DOUBLE: - case LABEL_REF: - case SYMBOL_REF: - return COSTS_N_INSNS (3); - - case UDIV: - case UMOD: - case DIV: - case MOD: - return 100; + i >>= 2; + cycles++; + } + return COSTS_N_INSNS (2) + cycles; + } + return COSTS_N_INSNS (1) + 16; + + case SET: + return (COSTS_N_INSNS (1) + + 4 * ((GET_CODE (SET_SRC (x)) == MEM) + 
+ GET_CODE (SET_DEST (x)) == MEM)); + + case CONST_INT: + if (outer == SET) + { + if ((unsigned HOST_WIDE_INT) INTVAL (x) < 256) + return 0; + if (thumb_shiftable_const (INTVAL (x))) + return COSTS_N_INSNS (2); + return COSTS_N_INSNS (3); + } + else if ((outer == PLUS || outer == COMPARE) + && INTVAL (x) < 256 && INTVAL (x) > -256) + return 0; + else if (outer == AND + && INTVAL (x) < 256 && INTVAL (x) >= -256) + return COSTS_N_INSNS (1); + else if (outer == ASHIFT || outer == ASHIFTRT + || outer == LSHIFTRT) + return 0; + return COSTS_N_INSNS (2); + + case CONST: + case CONST_DOUBLE: + case LABEL_REF: + case SYMBOL_REF: + return COSTS_N_INSNS (3); + + case UDIV: + case UMOD: + case DIV: + case MOD: + return 100; - case TRUNCATE: - return 99; + case TRUNCATE: + return 99; - case AND: - case XOR: - case IOR: - /* XXX guess. */ - return 8; - - case ADDRESSOF: - case MEM: - /* XXX another guess. */ - /* Memory costs quite a lot for the first word, but subsequent words - load at the equivalent of a single insn each. */ - return (10 + 4 * ((GET_MODE_SIZE (mode) - 1) / UNITS_PER_WORD) - + ((GET_CODE (x) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (x)) - ? 4 : 0)); - - case IF_THEN_ELSE: - /* XXX a guess. */ - if (GET_CODE (XEXP (x, 1)) == PC || GET_CODE (XEXP (x, 2)) == PC) - return 14; - return 2; + case AND: + case XOR: + case IOR: + /* XXX guess. */ + return 8; - case ZERO_EXTEND: - /* XXX still guessing. */ - switch (GET_MODE (XEXP (x, 0))) - { - case QImode: - return (1 + (mode == DImode ? 4 : 0) - + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); - - case HImode: - return (4 + (mode == DImode ? 4 : 0) - + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); - - case SImode: - return (1 + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); + case ADDRESSOF: + case MEM: + /* XXX another guess. */ + /* Memory costs quite a lot for the first word, but subsequent words + load at the equivalent of a single insn each. 
*/ + return (10 + 4 * ((GET_MODE_SIZE (mode) - 1) / UNITS_PER_WORD) + + ((GET_CODE (x) == SYMBOL_REF && CONSTANT_POOL_ADDRESS_P (x)) + ? 4 : 0)); + + case IF_THEN_ELSE: + /* XXX a guess. */ + if (GET_CODE (XEXP (x, 1)) == PC || GET_CODE (XEXP (x, 2)) == PC) + return 14; + return 2; + + case ZERO_EXTEND: + /* XXX still guessing. */ + switch (GET_MODE (XEXP (x, 0))) + { + case QImode: + return (1 + (mode == DImode ? 4 : 0) + + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); - default: - return 99; - } + case HImode: + return (4 + (mode == DImode ? 4 : 0) + + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); + case SImode: + return (1 + (GET_CODE (XEXP (x, 0)) == MEM ? 10 : 0)); + default: return 99; } + + default: + return 99; } - +} + + +/* Worker routine for arm_rtx_costs. */ +static inline int +arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) +{ + enum machine_mode mode = GET_MODE (x); + enum rtx_code subcode; + int extra_cost; + switch (code) { case MEM: @@ -3309,11 +3420,11 @@ arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) if (GET_MODE_CLASS (mode) == MODE_FLOAT) return (2 + ((REG_OR_SUBREG_REG (XEXP (x, 1)) || (GET_CODE (XEXP (x, 1)) == CONST_DOUBLE - && const_double_rtx_ok_for_fpa (XEXP (x, 1)))) + && arm_const_double_rtx (XEXP (x, 1)))) ? 0 : 8) + ((REG_OR_SUBREG_REG (XEXP (x, 0)) || (GET_CODE (XEXP (x, 0)) == CONST_DOUBLE - && const_double_rtx_ok_for_fpa (XEXP (x, 0)))) + && arm_const_double_rtx (XEXP (x, 0)))) ? 0 : 8)); if (((GET_CODE (XEXP (x, 0)) == CONST_INT @@ -3338,7 +3449,7 @@ arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) return (2 + (REG_OR_SUBREG_REG (XEXP (x, 0)) ? 0 : 8) + ((REG_OR_SUBREG_REG (XEXP (x, 1)) || (GET_CODE (XEXP (x, 1)) == CONST_DOUBLE - && const_double_rtx_ok_for_fpa (XEXP (x, 1)))) + && arm_const_double_rtx (XEXP (x, 1)))) ? 
0 : 8)); /* Fall through */ @@ -3388,65 +3499,11 @@ arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) return 8; case MULT: - /* There is no point basing this on the tuning, since it is always the - fast variant if it exists at all. */ - if (arm_fast_multiply && mode == DImode - && (GET_CODE (XEXP (x, 0)) == GET_CODE (XEXP (x, 1))) - && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND - || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND)) - return 8; - - if (GET_MODE_CLASS (mode) == MODE_FLOAT - || mode == DImode) - return 30; - - if (GET_CODE (XEXP (x, 1)) == CONST_INT) - { - unsigned HOST_WIDE_INT i = (INTVAL (XEXP (x, 1)) - & (unsigned HOST_WIDE_INT) 0xffffffff); - int cost, const_ok = const_ok_for_arm (i); - int j, booth_unit_size; - - if (arm_tune_xscale) - { - unsigned HOST_WIDE_INT masked_const; - - /* The cost will be related to two insns. - First a load of the constant (MOV or LDR), then a multiply. */ - cost = 2; - if (! const_ok) - cost += 1; /* LDR is probably more expensive because - of longer result latency. */ - masked_const = i & 0xffff8000; - if (masked_const != 0 && masked_const != 0xffff8000) - { - masked_const = i & 0xf8000000; - if (masked_const == 0 || masked_const == 0xf8000000) - cost += 1; - else - cost += 2; - } - return cost; - } - - /* Tune as appropriate. */ - cost = const_ok ? 4 : 8; - booth_unit_size = ((tune_flags & FL_FAST_MULT) ? 8 : 2); - for (j = 0; i && j < 32; j += booth_unit_size) - { - i >>= booth_unit_size; - cost += 2; - } - - return cost; - } - - return (((tune_flags & FL_FAST_MULT) ? 8 : 30) - + (REG_OR_SUBREG_REG (XEXP (x, 0)) ? 0 : 4) - + (REG_OR_SUBREG_REG (XEXP (x, 1)) ? 0 : 4)); + /* This should have been handled by the CPU specific routines. 
*/ + abort (); case TRUNCATE: - if (arm_fast_multiply && mode == SImode + if (arm_arch3m && mode == SImode && GET_CODE (XEXP (x, 0)) == LSHIFTRT && GET_CODE (XEXP (XEXP (x, 0), 0)) == MULT && (GET_CODE (XEXP (XEXP (XEXP (x, 0), 0), 0)) @@ -3527,7 +3584,7 @@ arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) return 6; case CONST_DOUBLE: - if (const_double_rtx_ok_for_fpa (x)) + if (arm_const_double_rtx (x)) return outer == SET ? 2 : -1; else if ((outer == COMPARE || outer == PLUS) && neg_const_double_rtx_ok_for_fpa (x)) @@ -3539,13 +3596,262 @@ arm_rtx_costs_1 (rtx x, enum rtx_code code, enum rtx_code outer) } } +/* RTX costs for cores with a slow MUL implimentation. */ + static bool -arm_rtx_costs (rtx x, int code, int outer_code, int *total) +arm_slowmul_rtx_costs (rtx x, int code, int outer_code, int *total) { - *total = arm_rtx_costs_1 (x, code, outer_code); - return true; + enum machine_mode mode = GET_MODE (x); + + if (TARGET_THUMB) + { + *total = thumb_rtx_costs (x, code, outer_code); + return true; + } + + switch (code) + { + case MULT: + if (GET_MODE_CLASS (mode) == MODE_FLOAT + || mode == DImode) + { + *total = 30; + return true; + } + + if (GET_CODE (XEXP (x, 1)) == CONST_INT) + { + unsigned HOST_WIDE_INT i = (INTVAL (XEXP (x, 1)) + & (unsigned HOST_WIDE_INT) 0xffffffff); + int cost, const_ok = const_ok_for_arm (i); + int j, booth_unit_size; + + /* Tune as appropriate. */ + cost = const_ok ? 4 : 8; + booth_unit_size = 2; + for (j = 0; i && j < 32; j += booth_unit_size) + { + i >>= booth_unit_size; + cost += 2; + } + + *total = cost; + return true; + } + + *total = 30 + (REG_OR_SUBREG_REG (XEXP (x, 0)) ? 0 : 4) + + (REG_OR_SUBREG_REG (XEXP (x, 1)) ? 0 : 4); + return true; + + default: + *total = arm_rtx_costs_1 (x, code, outer_code); + return true; + } } + +/* RTX cost for cores with a fast multiply unit (M variants). 
*/ + +static bool +arm_fastmul_rtx_costs (rtx x, int code, int outer_code, int *total) +{ + enum machine_mode mode = GET_MODE (x); + + if (TARGET_THUMB) + { + *total = thumb_rtx_costs (x, code, outer_code); + return true; + } + + switch (code) + { + case MULT: + /* There is no point basing this on the tuning, since it is always the + fast variant if it exists at all. */ + if (mode == DImode + && (GET_CODE (XEXP (x, 0)) == GET_CODE (XEXP (x, 1))) + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND)) + { + *total = 8; + return true; + } + + + if (GET_MODE_CLASS (mode) == MODE_FLOAT + || mode == DImode) + { + *total = 30; + return true; + } + + if (GET_CODE (XEXP (x, 1)) == CONST_INT) + { + unsigned HOST_WIDE_INT i = (INTVAL (XEXP (x, 1)) + & (unsigned HOST_WIDE_INT) 0xffffffff); + int cost, const_ok = const_ok_for_arm (i); + int j, booth_unit_size; + + /* Tune as appropriate. */ + cost = const_ok ? 4 : 8; + booth_unit_size = 8; + for (j = 0; i && j < 32; j += booth_unit_size) + { + i >>= booth_unit_size; + cost += 2; + } + + *total = cost; + return true; + } + + *total = 8 + (REG_OR_SUBREG_REG (XEXP (x, 0)) ? 0 : 4) + + (REG_OR_SUBREG_REG (XEXP (x, 1)) ? 0 : 4); + return true; + + default: + *total = arm_rtx_costs_1 (x, code, outer_code); + return true; + } +} + + +/* RTX cost for XScale CPUs. */ + +static bool +arm_xscale_rtx_costs (rtx x, int code, int outer_code, int *total) +{ + enum machine_mode mode = GET_MODE (x); + + if (TARGET_THUMB) + { + *total = thumb_rtx_costs (x, code, outer_code); + return true; + } + + switch (code) + { + case MULT: + /* There is no point basing this on the tuning, since it is always the + fast variant if it exists at all. 
*/ + if (mode == DImode + && (GET_CODE (XEXP (x, 0)) == GET_CODE (XEXP (x, 1))) + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND)) + { + *total = 8; + return true; + } + + + if (GET_MODE_CLASS (mode) == MODE_FLOAT + || mode == DImode) + { + *total = 30; + return true; + } + + if (GET_CODE (XEXP (x, 1)) == CONST_INT) + { + unsigned HOST_WIDE_INT i = (INTVAL (XEXP (x, 1)) + & (unsigned HOST_WIDE_INT) 0xffffffff); + int cost, const_ok = const_ok_for_arm (i); + unsigned HOST_WIDE_INT masked_const; + + /* The cost will be related to two insns. + First a load of the constant (MOV or LDR), then a multiply. */ + cost = 2; + if (! const_ok) + cost += 1; /* LDR is probably more expensive because + of longer result latency. */ + masked_const = i & 0xffff8000; + if (masked_const != 0 && masked_const != 0xffff8000) + { + masked_const = i & 0xf8000000; + if (masked_const == 0 || masked_const == 0xf8000000) + cost += 1; + else + cost += 2; + } + *total = cost; + return true; + } + + *total = 8 + (REG_OR_SUBREG_REG (XEXP (x, 0)) ? 0 : 4) + + (REG_OR_SUBREG_REG (XEXP (x, 1)) ? 0 : 4); + return true; + + default: + *total = arm_rtx_costs_1 (x, code, outer_code); + return true; + } +} + + +/* RTX costs for 9e (and later) cores. */ + +static bool +arm_9e_rtx_costs (rtx x, int code, int outer_code, int *total) +{ + enum machine_mode mode = GET_MODE (x); + int nonreg_cost; + int cost; + + if (TARGET_THUMB) + { + switch (code) + { + case MULT: + *total = COSTS_N_INSNS (3); + return true; + + default: + *total = thumb_rtx_costs (x, code, outer_code); + return true; + } + } + + switch (code) + { + case MULT: + /* There is no point basing this on the tuning, since it is always the + fast variant if it exists at all. 
*/ + if (mode == DImode + && (GET_CODE (XEXP (x, 0)) == GET_CODE (XEXP (x, 1))) + && (GET_CODE (XEXP (x, 0)) == ZERO_EXTEND + || GET_CODE (XEXP (x, 0)) == SIGN_EXTEND)) + { + *total = 3; + return true; + } + + + if (GET_MODE_CLASS (mode) == MODE_FLOAT) + { + *total = 30; + return true; + } + if (mode == DImode) + { + cost = 7; + nonreg_cost = 8; + } + else + { + cost = 2; + nonreg_cost = 4; + } + + + *total = cost + (REG_OR_SUBREG_REG (XEXP (x, 0)) ? 0 : nonreg_cost) + + (REG_OR_SUBREG_REG (XEXP (x, 1)) ? 0 : nonreg_cost); + return true; + + default: + *total = arm_rtx_costs_1 (x, code, outer_code); + return true; + } +} /* All address computations that can be done are free, but rtx cost returns the same for practically all of them. So we weight the different types of address here in the order (most pref first): @@ -3623,7 +3929,8 @@ arm_adjust_cost (rtx insn, rtx link, rtx dep, int cost) operand for INSN. If we have a shifted input operand and the instruction we depend on is another ALU instruction, then we may have to account for an additional stall. */ - if (shift_opnum != 0 && attr_type == TYPE_NORMAL) + if (shift_opnum != 0 + && (attr_type == TYPE_ALU_SHIFT || attr_type == TYPE_ALU_SHIFT_REG)) { rtx shifted_operand; int opno; @@ -3681,47 +3988,51 @@ arm_adjust_cost (rtx insn, rtx link, rtx dep, int cost) return cost; } -static int fpa_consts_inited = 0; +static int fp_consts_inited = 0; -static const char * const strings_fpa[8] = +/* Only zero is valid for VFP. Other values are also valid for FPA. 
*/ +static const char * const strings_fp[8] = { "0", "1", "2", "3", "4", "5", "0.5", "10" }; -static REAL_VALUE_TYPE values_fpa[8]; +static REAL_VALUE_TYPE values_fp[8]; static void -init_fpa_table (void) +init_fp_table (void) { int i; REAL_VALUE_TYPE r; - for (i = 0; i < 8; i++) + if (TARGET_VFP) + fp_consts_inited = 1; + else + fp_consts_inited = 8; + + for (i = 0; i < fp_consts_inited; i++) { - r = REAL_VALUE_ATOF (strings_fpa[i], DFmode); - values_fpa[i] = r; + r = REAL_VALUE_ATOF (strings_fp[i], DFmode); + values_fp[i] = r; } - - fpa_consts_inited = 1; } -/* Return TRUE if rtx X is a valid immediate FPA constant. */ +/* Return TRUE if rtx X is a valid immediate FP constant. */ int -const_double_rtx_ok_for_fpa (rtx x) +arm_const_double_rtx (rtx x) { REAL_VALUE_TYPE r; int i; - if (!fpa_consts_inited) - init_fpa_table (); + if (!fp_consts_inited) + init_fp_table (); REAL_VALUE_FROM_CONST_DOUBLE (r, x); if (REAL_VALUE_MINUS_ZERO (r)) return 0; - for (i = 0; i < 8; i++) - if (REAL_VALUES_EQUAL (r, values_fpa[i])) + for (i = 0; i < fp_consts_inited; i++) + if (REAL_VALUES_EQUAL (r, values_fp[i])) return 1; return 0; @@ -3734,8 +4045,8 @@ neg_const_double_rtx_ok_for_fpa (rtx x) REAL_VALUE_TYPE r; int i; - if (!fpa_consts_inited) - init_fpa_table (); + if (!fp_consts_inited) + init_fp_table (); REAL_VALUE_FROM_CONST_DOUBLE (r, x); r = REAL_VALUE_NEGATE (r); @@ -3743,7 +4054,7 @@ neg_const_double_rtx_ok_for_fpa (rtx x) return 0; for (i = 0; i < 8; i++) - if (REAL_VALUES_EQUAL (r, values_fpa[i])) + if (REAL_VALUES_EQUAL (r, values_fp[i])) return 1; return 0; @@ -3786,6 +4097,21 @@ arm_hard_register_operand (rtx op, enum machine_mode mode) && REGNO (op) < FIRST_PSEUDO_REGISTER); } +/* An arm register operand. 
*/ +int +arm_general_register_operand (rtx op, enum machine_mode mode) +{ + if (GET_MODE (op) != mode && mode != VOIDmode) + return 0; + + if (GET_CODE (op) == SUBREG) + op = SUBREG_REG (op); + + return (GET_CODE (op) == REG + && (REGNO (op) <= LAST_ARM_REGNUM + || REGNO (op) >= FIRST_PSEUDO_REGISTER)); +} + /* Only accept reg, subreg(reg), const_int. */ int reg_or_int_operand (rtx op, enum machine_mode mode) @@ -3957,9 +4283,10 @@ f_register_operand (rtx op, enum machine_mode mode) || REGNO_REG_CLASS (REGNO (op)) == FPA_REGS)); } -/* Return TRUE for valid operands for the rhs of an FPA instruction. */ +/* Return TRUE for valid operands for the rhs of an floating point insns. + Allows regs or certain consts on FPA, just regs for everything else. */ int -fpa_rhs_operand (rtx op, enum machine_mode mode) +arm_float_rhs_operand (rtx op, enum machine_mode mode) { if (s_register_operand (op, mode)) return TRUE; @@ -3967,14 +4294,14 @@ fpa_rhs_operand (rtx op, enum machine_mode mode) if (GET_MODE (op) != mode && mode != VOIDmode) return FALSE; - if (GET_CODE (op) == CONST_DOUBLE) - return const_double_rtx_ok_for_fpa (op); + if (TARGET_FPA && GET_CODE (op) == CONST_DOUBLE) + return arm_const_double_rtx (op); return FALSE; } int -fpa_add_operand (rtx op, enum machine_mode mode) +arm_float_add_operand (rtx op, enum machine_mode mode) { if (s_register_operand (op, mode)) return TRUE; @@ -3982,13 +4309,27 @@ fpa_add_operand (rtx op, enum machine_mode mode) if (GET_MODE (op) != mode && mode != VOIDmode) return FALSE; - if (GET_CODE (op) == CONST_DOUBLE) - return (const_double_rtx_ok_for_fpa (op) + if (TARGET_FPA && GET_CODE (op) == CONST_DOUBLE) + return (arm_const_double_rtx (op) || neg_const_double_rtx_ok_for_fpa (op)); return FALSE; } + +/* Return TRUE if OP is suitable for the rhs of a floating point comparison. + Depends which fpu we are targeting. 
*/ + +int +arm_float_compare_operand (rtx op, enum machine_mode mode) +{ + if (TARGET_VFP) + return vfp_compare_operand (op, mode); + else + return arm_float_rhs_operand (op, mode); +} + + /* Return nonzero if OP is a valid Cirrus memory address pattern. */ int cirrus_memory_offset (rtx op) @@ -4065,6 +4406,84 @@ cirrus_shift_const (rtx op, enum machine_mode mode ATTRIBUTE_UNUSED) && INTVAL (op) < 64); } + +/* Return TRUE if OP is a valid VFP memory address pattern. */ +/* Copied from cirrus_memory_offset but with restricted offset range. */ + +int +vfp_mem_operand (rtx op) +{ + /* Reject eliminable registers. */ + + if (! (reload_in_progress || reload_completed) + && ( reg_mentioned_p (frame_pointer_rtx, op) + || reg_mentioned_p (arg_pointer_rtx, op) + || reg_mentioned_p (virtual_incoming_args_rtx, op) + || reg_mentioned_p (virtual_outgoing_args_rtx, op) + || reg_mentioned_p (virtual_stack_dynamic_rtx, op) + || reg_mentioned_p (virtual_stack_vars_rtx, op))) + return FALSE; + + /* Constants are converted into offets from labels. */ + if (GET_CODE (op) == MEM) + { + rtx ind; + + ind = XEXP (op, 0); + + if (reload_completed + && (GET_CODE (ind) == LABEL_REF + || (GET_CODE (ind) == CONST + && GET_CODE (XEXP (ind, 0)) == PLUS + && GET_CODE (XEXP (XEXP (ind, 0), 0)) == LABEL_REF + && GET_CODE (XEXP (XEXP (ind, 0), 1)) == CONST_INT))) + return TRUE; + + /* Match: (mem (reg)). */ + if (GET_CODE (ind) == REG) + return arm_address_register_rtx_p (ind, 0); + + /* Match: + (mem (plus (reg) + (const))). */ + if (GET_CODE (ind) == PLUS + && GET_CODE (XEXP (ind, 0)) == REG + && REG_MODE_OK_FOR_BASE_P (XEXP (ind, 0), VOIDmode) + && GET_CODE (XEXP (ind, 1)) == CONST_INT + && INTVAL (XEXP (ind, 1)) > -1024 + && INTVAL (XEXP (ind, 1)) < 1024) + return TRUE; + } + + return FALSE; +} + + +/* Return TRUE if OP is a REG or constant zero. 
*/ +int +vfp_compare_operand (rtx op, enum machine_mode mode) +{ + if (s_register_operand (op, mode)) + return TRUE; + + return (GET_CODE (op) == CONST_DOUBLE + && arm_const_double_rtx (op)); +} + + +/* Return GENERAL_REGS if a scratch register required to reload x to/from + VFP registers. Otherwise return NO_REGS. */ + +enum reg_class +vfp_secondary_reload_class (enum machine_mode mode, rtx x) +{ + if (vfp_mem_operand (x) || s_register_operand (x, mode)) + return NO_REGS; + + return GENERAL_REGS; +} + + /* Returns TRUE if INSN is an "LDR REG, ADDR" instruction. Use by the Cirrus Maverick code which has to workaround a hardware bug triggered by such instructions. */ @@ -4300,7 +4719,7 @@ nonimmediate_di_operand (rtx op, enum machine_mode mode) return FALSE; } -/* Return TRUE for a valid operand of a DFmode operation when -msoft-float. +/* Return TRUE for a valid operand of a DFmode operation when soft-float. Either: REG, SUBREG, CONST_DOUBLE or MEM(DImode_address). Note that this disallows MEM(REG+REG), but allows MEM(PRE/POST_INC/DEC(REG)). 
*/ @@ -5649,7 +6068,7 @@ arm_select_cc_mode (enum rtx_code op, rtx x, rtx y) case LE: case GT: case GE: - if (TARGET_CIRRUS) + if (TARGET_HARD_FLOAT && TARGET_MAVERICK) return CCFPmode; return CCFPEmode; @@ -7195,13 +7614,13 @@ fp_immediate_constant (rtx x) REAL_VALUE_TYPE r; int i; - if (!fpa_consts_inited) - init_fpa_table (); + if (!fp_consts_inited) + init_fp_table (); REAL_VALUE_FROM_CONST_DOUBLE (r, x); for (i = 0; i < 8; i++) - if (REAL_VALUES_EQUAL (r, values_fpa[i])) - return strings_fpa[i]; + if (REAL_VALUES_EQUAL (r, values_fp[i])) + return strings_fp[i]; abort (); } @@ -7212,12 +7631,12 @@ fp_const_from_val (REAL_VALUE_TYPE *r) { int i; - if (!fpa_consts_inited) - init_fpa_table (); + if (!fp_consts_inited) + init_fp_table (); for (i = 0; i < 8; i++) - if (REAL_VALUES_EQUAL (*r, values_fpa[i])) - return strings_fpa[i]; + if (REAL_VALUES_EQUAL (*r, values_fp[i])) + return strings_fp[i]; abort (); } @@ -7262,6 +7681,124 @@ print_multi_reg (FILE *stream, const char *instr, int reg, int mask) fprintf (stream, "\n"); } + +/* Output the operands of a FLDM/FSTM instruction to STREAM. + REG is the base register, + INSTR is the possibly suffixed load or store instruction. + FMT specifies now to print the register name. + START and COUNT specify the register range. */ + +static void +vfp_print_multi (FILE *stream, const char *instr, int reg, + const char * fmt, int start, int count) +{ + int i; + + fputc ('\t', stream); + asm_fprintf (stream, instr, reg); + fputs (", {", stream); + + for (i = start; i < start + count; i++) + { + if (i > start) + fputs (", ", stream); + asm_fprintf (stream, fmt, i); + } + fputs ("}\n", stream); +} + + +/* Output the assembly for a store multiple. 
*/ + +const char * +vfp_output_fstmx (rtx * operands) +{ + char pattern[100]; + int p; + int base; + int i; + + strcpy (pattern, "fstmfdx\t%m0!, {%P1"); + p = strlen (pattern); + + if (GET_CODE (operands[1]) != REG) + abort (); + + base = (REGNO (operands[1]) - FIRST_VFP_REGNUM) / 2; + for (i = 1; i < XVECLEN (operands[2], 0); i++) + { + p += sprintf (&pattern[p], ", d%d", base + i); + } + strcpy (&pattern[p], "}"); + + output_asm_insn (pattern, operands); + return ""; +} + + +/* Emit RTL to save block of VFP register pairs to the stack. */ + +static rtx +vfp_emit_fstmx (int base_reg, int count) +{ + rtx par; + rtx dwarf; + rtx tmp, reg; + int i; + + /* ??? The frame layout is implementation defined. We describe + standard format 1 (equivalent to a FSTMD insn and unused pad word). + We really need some way of representing the whole block so that the + unwinder can figure it out at runtime. */ + par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (count)); + dwarf = gen_rtx_SEQUENCE (VOIDmode, rtvec_alloc (count + 1)); + + reg = gen_rtx_REG (DFmode, base_reg); + base_reg += 2; + + XVECEXP (par, 0, 0) + = gen_rtx_SET (VOIDmode, + gen_rtx_MEM (BLKmode, + gen_rtx_PRE_DEC (BLKmode, stack_pointer_rtx)), + gen_rtx_UNSPEC (BLKmode, + gen_rtvec (1, reg), + UNSPEC_PUSH_MULT)); + + tmp = gen_rtx_SET (VOIDmode, stack_pointer_rtx, + gen_rtx_PLUS (SImode, stack_pointer_rtx, + GEN_INT (-(count * 8 + 4)))); + RTX_FRAME_RELATED_P (tmp) = 1; + XVECEXP (dwarf, 0, 0) = tmp; + + tmp = gen_rtx_SET (VOIDmode, + gen_rtx_MEM (DFmode, stack_pointer_rtx), + reg); + RTX_FRAME_RELATED_P (tmp) = 1; + XVECEXP (dwarf, 0, 1) = tmp; + + for (i = 1; i < count; i++) + { + reg = gen_rtx_REG (DFmode, base_reg); + base_reg += 2; + XVECEXP (par, 0, i) = gen_rtx_USE (VOIDmode, reg); + + tmp = gen_rtx_SET (VOIDmode, + gen_rtx_MEM (DFmode, + gen_rtx_PLUS (SImode, + stack_pointer_rtx, + GEN_INT (i * 8))), + reg); + RTX_FRAME_RELATED_P (tmp) = 1; + XVECEXP (dwarf, 0, i + 1) = tmp; + } + + par = emit_insn (par); + 
REG_NOTES (par) = gen_rtx_EXPR_LIST (REG_FRAME_RELATED_EXPR, dwarf, + REG_NOTES (par)); + return par; +} + + /* Output a 'call' insn. */ const char * output_call (rtx *operands) @@ -8546,6 +9083,7 @@ arm_output_epilogue (rtx sibling) rtx eh_ofs = cfun->machine->eh_epilogue_sp_ofs; unsigned int lrm_count = 0; int really_return = (sibling == NULL); + int start_reg; /* If we have already generated the return instruction then it is futile to generate anything else. */ @@ -8591,11 +9129,12 @@ arm_output_epilogue (rtx sibling) if (frame_pointer_needed) { + /* This variable is for the Virtual Frame Pointer, not VFP regs. */ int vfp_offset = 4; if (arm_fpu_arch == FPUTYPE_FPA_EMU2) { - for (reg = LAST_ARM_FP_REGNUM; reg >= FIRST_ARM_FP_REGNUM; reg--) + for (reg = LAST_FPA_REGNUM; reg >= FIRST_FPA_REGNUM; reg--) if (regs_ever_live[reg] && !call_used_regs[reg]) { floats_offset += 12; @@ -8605,9 +9144,9 @@ arm_output_epilogue (rtx sibling) } else { - int start_reg = LAST_ARM_FP_REGNUM; + start_reg = LAST_FPA_REGNUM; - for (reg = LAST_ARM_FP_REGNUM; reg >= FIRST_ARM_FP_REGNUM; reg--) + for (reg = LAST_FPA_REGNUM; reg >= FIRST_FPA_REGNUM; reg--) { if (regs_ever_live[reg] && !call_used_regs[reg]) { @@ -8638,6 +9177,62 @@ arm_output_epilogue (rtx sibling) FP_REGNUM, floats_offset - vfp_offset); } + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + int nregs = 0; + + /* We save regs in pairs. */ + /* A special insn for saving/restoring VFP registers. This does + not have base+offset addressing modes, so we use IP to + hold the address. Each block requires nregs*2+1 words. */ + start_reg = FIRST_VFP_REGNUM; + /* Cound how many blocks of registers need saving. 
*/ + for (reg = FIRST_VFP_REGNUM; reg < LAST_VFP_REGNUM; reg += 2) + { + if ((!regs_ever_live[reg] || call_used_regs[reg]) + && (!regs_ever_live[reg + 1] || call_used_regs[reg + 1])) + { + if (start_reg != reg) + floats_offset += 4; + start_reg = reg + 2; + } + else + { + floats_offset += 8; + nregs++; + } + } + if (start_reg != reg) + floats_offset += 4; + + if (nregs > 0) + { + asm_fprintf (f, "\tsub\t%r, %r, #%d\n", IP_REGNUM, + FP_REGNUM, floats_offset - vfp_offset); + } + start_reg = FIRST_VFP_REGNUM; + for (reg = FIRST_VFP_REGNUM; reg < LAST_VFP_REGNUM; reg += 2) + { + if ((!regs_ever_live[reg] || call_used_regs[reg]) + && (!regs_ever_live[reg + 1] || call_used_regs[reg + 1])) + { + if (start_reg != reg) + { + vfp_print_multi (f, "fldmfdx\t%r!", IP_REGNUM, "d%d", + (start_reg - FIRST_VFP_REGNUM) / 2, + (reg - start_reg) / 2); + } + start_reg = reg + 2; + } + } + if (start_reg != reg) + { + vfp_print_multi (f, "fldmfdx\t%r!", IP_REGNUM, "d%d", + (start_reg - FIRST_VFP_REGNUM) / 2, + (reg - start_reg) / 2); + } + } + if (TARGET_IWMMXT) { /* The frame pointer is guaranteed to be non-double-word aligned. 
@@ -8712,16 +9307,16 @@ arm_output_epilogue (rtx sibling) if (arm_fpu_arch == FPUTYPE_FPA_EMU2) { - for (reg = FIRST_ARM_FP_REGNUM; reg <= LAST_ARM_FP_REGNUM; reg++) + for (reg = FIRST_FPA_REGNUM; reg <= LAST_FPA_REGNUM; reg++) if (regs_ever_live[reg] && !call_used_regs[reg]) asm_fprintf (f, "\tldfe\t%r, [%r], #12\n", reg, SP_REGNUM); } else { - int start_reg = FIRST_ARM_FP_REGNUM; + start_reg = FIRST_FPA_REGNUM; - for (reg = FIRST_ARM_FP_REGNUM; reg <= LAST_ARM_FP_REGNUM; reg++) + for (reg = FIRST_FPA_REGNUM; reg <= LAST_FPA_REGNUM; reg++) { if (regs_ever_live[reg] && !call_used_regs[reg]) { @@ -8749,6 +9344,30 @@ arm_output_epilogue (rtx sibling) start_reg, reg - start_reg, SP_REGNUM); } + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + start_reg = FIRST_VFP_REGNUM; + for (reg = FIRST_VFP_REGNUM; reg < LAST_VFP_REGNUM; reg += 2) + { + if ((!regs_ever_live[reg] || call_used_regs[reg]) + && (!regs_ever_live[reg + 1] || call_used_regs[reg + 1])) + { + if (start_reg != reg) + { + vfp_print_multi (f, "fldmfdx\t%r!", SP_REGNUM, "d%d", + (start_reg - FIRST_VFP_REGNUM) / 2, + (reg - start_reg) / 2); + } + start_reg = reg + 2; + } + } + if (start_reg != reg) + { + vfp_print_multi (f, "fldmfdx\t%r!", SP_REGNUM, "d%d", + (start_reg - FIRST_VFP_REGNUM) / 2, + (reg - start_reg) / 2); + } + } if (TARGET_IWMMXT) for (reg = FIRST_IWMMXT_REGNUM; reg <= LAST_IWMMXT_REGNUM; reg++) if (regs_ever_live[reg] && !call_used_regs[reg]) @@ -9051,6 +9670,7 @@ emit_sfm (int base_reg, int count) return par; } + /* Compute the distance from register FROM to register TO. These can be the arg pointer (26), the soft frame pointer (25), the stack pointer (13) or the hard frame pointer (11). 
@@ -9112,6 +9732,7 @@ arm_compute_initial_elimination_offset (unsigned int from, unsigned int to) { unsigned int reg_mask; unsigned int reg; + bool new_block; /* Make sure that we compute which registers will be saved on the stack using the same algorithm that is used by @@ -9132,10 +9753,31 @@ arm_compute_initial_elimination_offset (unsigned int from, unsigned int to) /* If the hard floating point registers are going to be used then they must be saved on the stack as well. Each register occupies 12 bytes of stack space. */ - for (reg = FIRST_ARM_FP_REGNUM; reg <= LAST_ARM_FP_REGNUM; reg++) + for (reg = FIRST_FPA_REGNUM; reg <= LAST_FPA_REGNUM; reg++) if (regs_ever_live[reg] && ! call_used_regs[reg]) call_saved_registers += 12; + /* Likewise VFP regs. */ + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + new_block = TRUE; + for (reg = FIRST_VFP_REGNUM; reg < LAST_VFP_REGNUM; reg += 2) + { + if ((regs_ever_live[reg] && !call_used_regs[reg]) + || (regs_ever_live[reg + 1] && !call_used_regs[reg + 1])) + { + if (new_block) + { + call_saved_registers += 4; + new_block = FALSE; + } + call_saved_registers += 8; + } + else + new_block = TRUE; + } + } + if (TARGET_REALLY_IWMMXT) /* Check for the call-saved iWMMXt registers. */ for (reg = FIRST_IWMMXT_REGNUM; reg <= LAST_IWMMXT_REGNUM; reg++) @@ -9231,6 +9873,7 @@ arm_get_frame_size (void) int entry_size = 0; unsigned long func_type = arm_current_func_type (); int leaf; + bool new_block; if (! TARGET_ARM) abort(); @@ -9274,12 +9917,33 @@ arm_get_frame_size (void) /* Space for saved registers. */ entry_size += bit_count (arm_compute_save_reg_mask ()) * 4; - /* Space for saved FPA registers. */ if (! IS_VOLATILE (func_type)) { - for (regno = FIRST_ARM_FP_REGNUM; regno <= LAST_ARM_FP_REGNUM; regno++) + /* Space for saved FPA registers. */ + for (regno = FIRST_FPA_REGNUM; regno <= LAST_FPA_REGNUM; regno++) if (regs_ever_live[regno] && ! call_used_regs[regno]) entry_size += 12; + + /* Space for saved VFP registers. 
*/ + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + new_block = TRUE; + for (regno = FIRST_VFP_REGNUM; regno < LAST_VFP_REGNUM; regno += 2) + { + if ((regs_ever_live[regno] && !call_used_regs[regno]) + || (regs_ever_live[regno + 1] && !call_used_regs[regno + 1])) + { + if (new_block) + { + entry_size += 4; + new_block = FALSE; + } + entry_size += 8; + } + else + new_block = TRUE; + } + } } if (TARGET_REALLY_IWMMXT) @@ -9474,11 +10138,13 @@ arm_expand_prologue (void) if (! IS_VOLATILE (func_type)) { + int start_reg; + /* Save any floating point call-saved registers used by this function. */ if (arm_fpu_arch == FPUTYPE_FPA_EMU2) { - for (reg = LAST_ARM_FP_REGNUM; reg >= FIRST_ARM_FP_REGNUM; reg--) + for (reg = LAST_FPA_REGNUM; reg >= FIRST_FPA_REGNUM; reg--) if (regs_ever_live[reg] && !call_used_regs[reg]) { insn = gen_rtx_PRE_DEC (XFmode, stack_pointer_rtx); @@ -9490,9 +10156,9 @@ arm_expand_prologue (void) } else { - int start_reg = LAST_ARM_FP_REGNUM; + start_reg = LAST_FPA_REGNUM; - for (reg = LAST_ARM_FP_REGNUM; reg >= FIRST_ARM_FP_REGNUM; reg--) + for (reg = LAST_FPA_REGNUM; reg >= FIRST_FPA_REGNUM; reg--) { if (regs_ever_live[reg] && !call_used_regs[reg]) { @@ -9520,6 +10186,31 @@ arm_expand_prologue (void) RTX_FRAME_RELATED_P (insn) = 1; } } + if (TARGET_HARD_FLOAT && TARGET_VFP) + { + start_reg = FIRST_VFP_REGNUM; + + for (reg = FIRST_VFP_REGNUM; reg < LAST_VFP_REGNUM; reg += 2) + { + if ((!regs_ever_live[reg] || call_used_regs[reg]) + && (!regs_ever_live[reg + 1] || call_used_regs[reg + 1])) + { + if (start_reg != reg) + { + insn = vfp_emit_fstmx (start_reg, + (reg - start_reg) / 2); + RTX_FRAME_RELATED_P (insn) = 1; + } + start_reg = reg + 2; + } + } + if (start_reg != reg) + { + insn = vfp_emit_fstmx (start_reg, + (reg - start_reg) / 2); + RTX_FRAME_RELATED_P (insn) = 1; + } + } } if (frame_pointer_needed) @@ -9839,6 +10530,27 @@ arm_print_operand (FILE *stream, rtx x, int code) } return; + /* Print a VFP double precision register name. 
*/ + case 'P': + { + int mode = GET_MODE (x); + int num; + + if (mode != DImode && mode != DFmode) + abort (); + + if (GET_CODE (x) != REG + || !IS_VFP_REGNUM (REGNO (x))) + abort (); + + num = REGNO(x) - FIRST_VFP_REGNUM; + if (num & 1) + abort (); + + fprintf (stream, "d%d", num >> 1); + } + return; + default: if (x == 0) abort (); @@ -10434,7 +11146,7 @@ int arm_hard_regno_mode_ok (unsigned int regno, enum machine_mode mode) { if (GET_MODE_CLASS (mode) == MODE_CC) - return regno == CC_REGNUM; + return regno == CC_REGNUM || regno == VFPCC_REGNUM; if (TARGET_THUMB) /* For the Thumb we only allow values bigger than SImode in @@ -10452,6 +11164,17 @@ arm_hard_regno_mode_ok (unsigned int regno, enum machine_mode mode) get sign extended to 64bits-- aldyh. */ return (GET_MODE_CLASS (mode) == MODE_FLOAT) || (mode == DImode); + if (IS_VFP_REGNUM (regno)) + { + if (mode == SFmode || mode == SImode) + return TRUE; + + /* DFmode values are only valid in even register pairs. */ + if (mode == DFmode) + return ((regno - FIRST_VFP_REGNUM) & 1) == 0; + return FALSE; + } + if (IS_IWMMXT_GR_REGNUM (regno)) return mode == SImode; @@ -10470,8 +11193,8 @@ arm_hard_regno_mode_ok (unsigned int regno, enum machine_mode mode) /* The only registers left are the FPA registers which we only allow to hold FP values. 
*/ return GET_MODE_CLASS (mode) == MODE_FLOAT - && regno >= FIRST_ARM_FP_REGNUM - && regno <= LAST_ARM_FP_REGNUM; + && regno >= FIRST_FPA_REGNUM + && regno <= LAST_FPA_REGNUM; } int @@ -10493,12 +11216,15 @@ arm_regno_class (int regno) || regno == ARG_POINTER_REGNUM) return GENERAL_REGS; - if (regno == CC_REGNUM) + if (regno == CC_REGNUM || regno == VFPCC_REGNUM) return NO_REGS; if (IS_CIRRUS_REGNUM (regno)) return CIRRUS_REGS; + if (IS_VFP_REGNUM (regno)) + return VFP_REGS; + if (IS_IWMMXT_REGNUM (regno)) return IWMMXT_REGS; @@ -10694,13 +11420,13 @@ static const struct builtin_description bdesc_2arg[] = IWMMXT_BUILTIN2 (lshrv2si3_di, WSRLW) IWMMXT_BUILTIN2 (lshrv2si3, WSRLWI) IWMMXT_BUILTIN2 (lshrdi3_di, WSRLD) - IWMMXT_BUILTIN2 (lshrdi3, WSRLDI) + IWMMXT_BUILTIN2 (lshrdi3_iwmmxt, WSRLDI) IWMMXT_BUILTIN2 (ashrv4hi3_di, WSRAH) IWMMXT_BUILTIN2 (ashrv4hi3, WSRAHI) IWMMXT_BUILTIN2 (ashrv2si3_di, WSRAW) IWMMXT_BUILTIN2 (ashrv2si3, WSRAWI) IWMMXT_BUILTIN2 (ashrdi3_di, WSRAD) - IWMMXT_BUILTIN2 (ashrdi3, WSRADI) + IWMMXT_BUILTIN2 (ashrdi3_iwmmxt, WSRADI) IWMMXT_BUILTIN2 (rorv4hi3_di, WRORH) IWMMXT_BUILTIN2 (rorv4hi3, WRORHI) IWMMXT_BUILTIN2 (rorv2si3_di, WRORW) @@ -13262,6 +13988,8 @@ arm_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, HOST_WIDE_INT vcall_offset ATTRIBUTE_UNUSED, tree function) { + static int thunk_label = 0; + char label[256]; int mi_delta = delta; const char *const mi_op = mi_delta < 0 ? "sub" : "add"; int shift = 0; @@ -13269,6 +13997,14 @@ arm_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, ? 
1 : 0); if (mi_delta < 0) mi_delta = - mi_delta; + if (TARGET_THUMB) + { + int labelno = thunk_label++; + ASM_GENERATE_INTERNAL_LABEL (label, "LTHUMBFUNC", labelno); + fputs ("\tldr\tr12, ", file); + assemble_name (file, label); + fputc ('\n', file); + } while (mi_delta != 0) { if ((mi_delta & (3 << shift)) == 0) @@ -13282,11 +14018,22 @@ arm_output_mi_thunk (FILE *file, tree thunk ATTRIBUTE_UNUSED, shift += 8; } } - fputs ("\tb\t", file); - assemble_name (file, XSTR (XEXP (DECL_RTL (function), 0), 0)); - if (NEED_PLT_RELOC) - fputs ("(PLT)", file); - fputc ('\n', file); + if (TARGET_THUMB) + { + fprintf (file, "\tbx\tr12\n"); + ASM_OUTPUT_ALIGN (file, 2); + assemble_name (file, label); + fputs (":\n", file); + assemble_integer (XEXP (DECL_RTL (function), 0), 4, BITS_PER_WORD, 1); + } + else + { + fputs ("\tb\t", file); + assemble_name (file, XSTR (XEXP (DECL_RTL (function), 0), 0)); + if (NEED_PLT_RELOC) + fputs ("(PLT)", file); + fputc ('\n', file); + } } int @@ -13384,3 +14131,117 @@ arm_setup_incoming_varargs (CUMULATIVE_ARGS *cum, if (cum->nregs < NUM_ARG_REGS) *pretend_size = (NUM_ARG_REGS - cum->nregs) * UNITS_PER_WORD; } + +/* Return non-zero if the CONSUMER instruction (a store) does not need + PRODUCER's value to calculate the address. */ + +int +arm_no_early_store_addr_dep (rtx producer, rtx consumer) +{ + rtx value = PATTERN (producer); + rtx addr = PATTERN (consumer); + + if (GET_CODE (value) == COND_EXEC) + value = COND_EXEC_CODE (value); + if (GET_CODE (value) == PARALLEL) + value = XVECEXP (value, 0, 0); + value = XEXP (value, 0); + if (GET_CODE (addr) == COND_EXEC) + addr = COND_EXEC_CODE (addr); + if (GET_CODE (addr) == PARALLEL) + addr = XVECEXP (addr, 0, 0); + addr = XEXP (addr, 0); + + return !reg_overlap_mentioned_p (value, addr); +} + +/* Return non-zero if the CONSUMER instruction (an ALU op) does not + have an early register shift value or amount dependency on the + result of PRODUCER. 
*/ + +int +arm_no_early_alu_shift_dep (rtx producer, rtx consumer) +{ + rtx value = PATTERN (producer); + rtx op = PATTERN (consumer); + rtx early_op; + + if (GET_CODE (value) == COND_EXEC) + value = COND_EXEC_CODE (value); + if (GET_CODE (value) == PARALLEL) + value = XVECEXP (value, 0, 0); + value = XEXP (value, 0); + if (GET_CODE (op) == COND_EXEC) + op = COND_EXEC_CODE (op); + if (GET_CODE (op) == PARALLEL) + op = XVECEXP (op, 0, 0); + op = XEXP (op, 1); + + early_op = XEXP (op, 0); + /* This is either an actual independent shift, or a shift applied to + the first operand of another operation. We want the whole shift + operation. */ + if (GET_CODE (early_op) == REG) + early_op = op; + + return !reg_overlap_mentioned_p (value, early_op); +} + +/* Return non-zero if the CONSUMER instruction (an ALU op) does not + have an early register shift value dependency on the result of + PRODUCER. */ + +int +arm_no_early_alu_shift_value_dep (rtx producer, rtx consumer) +{ + rtx value = PATTERN (producer); + rtx op = PATTERN (consumer); + rtx early_op; + + if (GET_CODE (value) == COND_EXEC) + value = COND_EXEC_CODE (value); + if (GET_CODE (value) == PARALLEL) + value = XVECEXP (value, 0, 0); + value = XEXP (value, 0); + if (GET_CODE (op) == COND_EXEC) + op = COND_EXEC_CODE (op); + if (GET_CODE (op) == PARALLEL) + op = XVECEXP (op, 0, 0); + op = XEXP (op, 1); + + early_op = XEXP (op, 0); + + /* This is either an actual independent shift, or a shift applied to + the first operand of another operation. We want the value being + shifted, in either case. */ + if (GET_CODE (early_op) != REG) + early_op = XEXP (early_op, 0); + + return !reg_overlap_mentioned_p (value, early_op); +} + +/* Return non-zero if the CONSUMER (a mul or mac op) does not + have an early register mult dependency on the result of + PRODUCER. 
*/ + +int +arm_no_early_mul_dep (rtx producer, rtx consumer) +{ + rtx value = PATTERN (producer); + rtx op = PATTERN (consumer); + + if (GET_CODE (value) == COND_EXEC) + value = COND_EXEC_CODE (value); + if (GET_CODE (value) == PARALLEL) + value = XVECEXP (value, 0, 0); + value = XEXP (value, 0); + if (GET_CODE (op) == COND_EXEC) + op = COND_EXEC_CODE (op); + if (GET_CODE (op) == PARALLEL) + op = XVECEXP (op, 0, 0); + op = XEXP (op, 1); + + return (GET_CODE (op) == PLUS + && !reg_overlap_mentioned_p (value, XEXP (op, 0))); +} + diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h index 75b14455878..7b6a79cd1b5 100644 --- a/gcc/config/arm/arm.h +++ b/gcc/config/arm/arm.h @@ -30,9 +30,10 @@ #define TARGET_CPU_CPP_BUILTINS() \ do \ { \ - if (TARGET_ARM) \ - builtin_define ("__arm__"); \ - else \ + /* Define __arm__ even when in thumb mode, for \ + consistency with armcc. */ \ + builtin_define ("__arm__"); \ + if (TARGET_THUMB) \ builtin_define ("__thumb__"); \ \ if (TARGET_BIG_END) \ @@ -58,9 +59,7 @@ if (TARGET_SOFT_FLOAT) \ builtin_define ("__SOFTFP__"); \ \ - /* FIXME: TARGET_HARD_FLOAT currently implies \ - FPA. */ \ - if (TARGET_VFP && !TARGET_HARD_FLOAT) \ + if (TARGET_VFP) \ builtin_define ("__VFP_FP__"); \ \ /* Add a define for interworking. \ @@ -98,13 +97,27 @@ #define TARGET_CPU_xscale 0x0100 #define TARGET_CPU_ep9312 0x0200 #define TARGET_CPU_iwmmxt 0x0400 -#define TARGET_CPU_arm926ej_s 0x0800 -#define TARGET_CPU_arm1026ej_s 0x1000 -#define TARGET_CPU_arm1136j_s 0x2000 -#define TARGET_CPU_arm1136jf_s 0x4000 +#define TARGET_CPU_arm926ejs 0x0800 +#define TARGET_CPU_arm1026ejs 0x1000 +#define TARGET_CPU_arm1136js 0x2000 +#define TARGET_CPU_arm1136jfs 0x4000 /* Configure didn't specify. */ #define TARGET_CPU_generic 0x8000 +/* The various ARM cores. */ +enum processor_type +{ +#define ARM_CORE(NAME, FLAGS, COSTS) \ + NAME, +#include "arm-cores.def" +#undef ARM_CORE + /* Used to indicate that no processor has been specified. 
*/ + arm_none +}; + +/* The processor for which instructions should be scheduled. */ +extern enum processor_type arm_tune; + typedef enum arm_cond_code { ARM_EQ = 0, ARM_NE, ARM_CS, ARM_CC, ARM_MI, ARM_PL, ARM_VS, ARM_VC, @@ -121,8 +134,12 @@ extern int arm_ccfsm_state; extern GTY(()) rtx arm_target_insn; /* Run-time compilation parameters selecting different hardware subsets. */ extern int target_flags; -/* The floating point instruction architecture, can be 2 or 3 */ -extern const char * target_fp_name; +/* The floating point mode. */ +extern const char *target_fpu_name; +/* For backwards compatability. */ +extern const char *target_fpe_name; +/* Whether to use floating point hardware. */ +extern const char *target_float_abi_name; /* Define the information needed to generate branch insns. This is stored from the compare operation. */ extern GTY(()) rtx arm_compare_op0; @@ -181,6 +198,14 @@ extern GTY(()) rtx aof_pic_label; #if TARGET_CPU_DEFAULT == TARGET_CPU_iwmmxt #define CPP_ARCH_DEFAULT_SPEC "-D__ARM_ARCH_5TE__ -D__XSCALE__ -D__IWMMXT__" #else +#if (TARGET_CPU_DEFAULT == TARGET_CPU_arm926ejs || \ + TARGET_CPU_DEFAULT == TARGET_CPU_arm1026ejs) +#define CPP_ARCH_DEFAULT_SPEC "-D__ARM_ARCH_5TEJ__" +#else +#if (TARGET_CPU_DEFAULT == TARGET_CPU_arm1136js || \ + TARGET_CPU_DEFAULT == TARGET_CPU_arm1136jfs) +#define CPP_ARCH_DEFAULT_SPEC "-D__ARM_ARCH_6J__" +#else #error Unrecognized value in TARGET_CPU_DEFAULT. 
#endif #endif @@ -190,6 +215,8 @@ extern GTY(()) rtx aof_pic_label; #endif #endif #endif +#endif +#endif #undef CPP_SPEC #define CPP_SPEC "%(cpp_cpu_arch) %(subtarget_cpp_spec) \ @@ -225,7 +252,11 @@ extern GTY(()) rtx aof_pic_label; %{march=arm9:-D__ARM_ARCH_4T__} \ %{march=arm920:-D__ARM_ARCH_4__} \ %{march=arm920t:-D__ARM_ARCH_4T__} \ +%{march=arm926ejs:-D__ARM_ARCH_5TEJ__} \ %{march=arm9tdmi:-D__ARM_ARCH_4T__} \ +%{march=arm1026ejs:-D__ARM_ARCH_5TEJ__} \ +%{march=arm1136js:-D__ARM_ARCH_6J__} \ +%{march=arm1136jfs:-D__ARM_ARCH_6J__} \ %{march=strongarm:-D__ARM_ARCH_4__} \ %{march=strongarm110:-D__ARM_ARCH_4__} \ %{march=strongarm1100:-D__ARM_ARCH_4__} \ @@ -243,6 +274,8 @@ extern GTY(()) rtx aof_pic_label; %{march=armv5t:-D__ARM_ARCH_5T__} \ %{march=armv5e:-D__ARM_ARCH_5E__} \ %{march=armv5te:-D__ARM_ARCH_5TE__} \ +%{march=armv6:-D__ARM_ARCH6__} \ +%{march=armv6j:-D__ARM_ARCH6J__} \ %{!march=*: \ %{mcpu=arm2:-D__ARM_ARCH_2__} \ %{mcpu=arm250:-D__ARM_ARCH_2__} \ @@ -266,7 +299,11 @@ extern GTY(()) rtx aof_pic_label; %{mcpu=arm9:-D__ARM_ARCH_4T__} \ %{mcpu=arm920:-D__ARM_ARCH_4__} \ %{mcpu=arm920t:-D__ARM_ARCH_4T__} \ + %{mcpu=arm926ejs:-D__ARM_ARCH_5TEJ__} \ %{mcpu=arm9tdmi:-D__ARM_ARCH_4T__} \ + %{mcpu=arm1026ejs:-D__ARM_ARCH_5TEJ__} \ + %{mcpu=arm1136js:-D__ARM_ARCH_6J__} \ + %{mcpu=arm1136jfs:-D__ARM_ARCH_6J__} \ %{mcpu=strongarm:-D__ARM_ARCH_4__} \ %{mcpu=strongarm110:-D__ARM_ARCH_4__} \ %{mcpu=strongarm1100:-D__ARM_ARCH_4__} \ @@ -414,13 +451,14 @@ extern GTY(()) rtx aof_pic_label; #define TARGET_APCS_REENT (target_flags & ARM_FLAG_APCS_REENT) #define TARGET_ATPCS (target_flags & ARM_FLAG_ATPCS) #define TARGET_MMU_TRAPS (target_flags & ARM_FLAG_MMU_TRAPS) -#define TARGET_SOFT_FLOAT (target_flags & ARM_FLAG_SOFT_FLOAT) -#define TARGET_HARD_FLOAT (! 
TARGET_SOFT_FLOAT) -#define TARGET_CIRRUS (arm_is_cirrus) -#define TARGET_ANY_HARD_FLOAT (TARGET_HARD_FLOAT || TARGET_CIRRUS) +#define TARGET_SOFT_FLOAT (arm_float_abi == ARM_FLOAT_ABI_SOFT) +#define TARGET_SOFT_FLOAT_ABI (arm_float_abi != ARM_FLOAT_ABI_HARD) +#define TARGET_HARD_FLOAT (arm_float_abi == ARM_FLOAT_ABI_HARD) +#define TARGET_FPA (arm_fp_model == ARM_FP_MODEL_FPA) +#define TARGET_MAVERICK (arm_fp_model == ARM_FP_MODEL_MAVERICK) +#define TARGET_VFP (arm_fp_model == ARM_FP_MODEL_VFP) #define TARGET_IWMMXT (arm_arch_iwmmxt) #define TARGET_REALLY_IWMMXT (TARGET_IWMMXT && TARGET_ARM) -#define TARGET_VFP (target_flags & ARM_FLAG_VFP) #define TARGET_BIG_END (target_flags & ARM_FLAG_BIG_END) #define TARGET_INTERWORK (target_flags & ARM_FLAG_INTERWORK) #define TARGET_LITTLE_WORDS (target_flags & ARM_FLAG_LITTLE_WORDS) @@ -523,20 +561,23 @@ extern GTY(()) rtx aof_pic_label; {"", TARGET_DEFAULT, "" } \ } -#define TARGET_OPTIONS \ -{ \ - {"cpu=", & arm_select[0].string, \ - N_("Specify the name of the target CPU"), 0}, \ - {"arch=", & arm_select[1].string, \ - N_("Specify the name of the target architecture"), 0}, \ - {"tune=", & arm_select[2].string, "", 0}, \ - {"fpe=", & target_fp_name, "" , 0}, \ - {"fp=", & target_fp_name, \ - N_("Specify the version of the floating point emulator"), 0},\ - {"structure-size-boundary=", & structure_size_string, \ - N_("Specify the minimum bit alignment of structures"), 0}, \ - {"pic-register=", & arm_pic_register_string, \ - N_("Specify the register to be used for PIC addressing"), 0} \ +#define TARGET_OPTIONS \ +{ \ + {"cpu=", & arm_select[0].string, \ + N_("Specify the name of the target CPU"), 0}, \ + {"arch=", & arm_select[1].string, \ + N_("Specify the name of the target architecture"), 0}, \ + {"tune=", & arm_select[2].string, "", 0}, \ + {"fpe=", & target_fpe_name, "", 0}, \ + {"fp=", & target_fpe_name, "", 0}, \ + {"fpu=", & target_fpu_name, \ + N_("Specify the name of the target floating point hardware/format"), 0}, \ 
+ {"float-abi=", & target_float_abi_name, \ + N_("Specify if floating point hardware should be used"), 0}, \ + {"structure-size-boundary=", & structure_size_string, \ + N_("Specify the minimum bit alignment of structures"), 0}, \ + {"pic-register=", & arm_pic_register_string, \ + N_("Specify the register to be used for PIC addressing"), 0} \ } /* Support for a compile-time default CPU, et cetera. The rules are: @@ -545,13 +586,16 @@ extern GTY(()) rtx aof_pic_label; by --with-arch. --with-tune is ignored if -mtune or -mcpu are specified (but not affected by -march). - --with-float is ignored if -mhard-float or -msoft-float are - specified. */ + --with-float is ignored if -mhard-float, -msoft-float or -mfloat-abi are + specified. + --with-fpu is ignored if -mfpu is specified. */ #define OPTION_DEFAULT_SPECS \ {"arch", "%{!march=*:%{!mcpu=*:-march=%(VALUE)}}" }, \ {"cpu", "%{!march=*:%{!mcpu=*:-mcpu=%(VALUE)}}" }, \ {"tune", "%{!mcpu=*:%{!mtune=*:-mtune=%(VALUE)}}" }, \ - {"float", "%{!msoft-float:%{!mhard-float:-m%(VALUE)-float}}" } + {"float", \ + "%{!msoft-float:%{!mhard-float:%{!mfloat-abi=*:-mfloat-abi=%(VALUE)}}}" }, \ + {"fpu", "%{!mfpu=*:-mfpu=%(VALUE)}"}, struct arm_cpu_select { @@ -576,12 +620,26 @@ enum prog_mode_type extern enum prog_mode_type arm_prgmode; -/* What sort of floating point unit do we have? Hardware or software. - If software, is it issue 2 or issue 3? */ +/* Which floating point model to use. */ +enum arm_fp_model +{ + ARM_FP_MODEL_UNKNOWN, + /* FPA model (Hardware or software). */ + ARM_FP_MODEL_FPA, + /* Cirrus Maverick floating point model. */ + ARM_FP_MODEL_MAVERICK, + /* VFP floating point model. */ + ARM_FP_MODEL_VFP +}; + +extern enum arm_fp_model arm_fp_model; + +/* Which floating point hardware is available. Also update + fp_model_for_fpu in arm.c when adding entries to this list. */ enum fputype { - /* Software floating point, FPA style double fmt. */ - FPUTYPE_SOFT_FPA, + /* No FP hardware. */ + FPUTYPE_NONE, /* Full FPA support. 
*/ FPUTYPE_FPA, /* Emulated FPA hardware, Issue 2 emulator (no LFM/SFM). */ @@ -589,7 +647,9 @@ enum fputype /* Emulated FPA hardware, Issue 3 emulator. */ FPUTYPE_FPA_EMU3, /* Cirrus Maverick floating point co-processor. */ - FPUTYPE_MAVERICK + FPUTYPE_MAVERICK, + /* VFP. */ + FPUTYPE_VFP }; /* Recast the floating point class to be the floating point attribute. */ @@ -601,8 +661,21 @@ extern enum fputype arm_fpu_tune; /* What type of floating point instructions are available */ extern enum fputype arm_fpu_arch; +enum float_abi_type +{ + ARM_FLOAT_ABI_SOFT, + ARM_FLOAT_ABI_SOFTFP, + ARM_FLOAT_ABI_HARD +}; + +extern enum float_abi_type arm_float_abi; + /* Default floating point architecture. Override in sub-target if - necessary. */ + necessary. + FIXME: Is this still neccessary/desirable? Do we want VFP chips to + default to VFP unless overridden by a subtarget? If so it would be best + to remove these definitions. It also assumes there is only one cpu model + with a Maverick fpu. */ #ifndef FPUTYPE_DEFAULT #define FPUTYPE_DEFAULT FPUTYPE_FPA_EMU2 #endif @@ -612,19 +685,21 @@ extern enum fputype arm_fpu_arch; #define FPUTYPE_DEFAULT FPUTYPE_MAVERICK #endif -/* Nonzero if the processor has a fast multiply insn, and one that does - a 64-bit multiply of two 32-bit values. */ -extern int arm_fast_multiply; +/* Nonzero if this chip supports the ARM Architecture 3M extensions. */ +extern int arm_arch3m; -/* Nonzero if this chip supports the ARM Architecture 4 extensions */ +/* Nonzero if this chip supports the ARM Architecture 4 extensions. */ extern int arm_arch4; -/* Nonzero if this chip supports the ARM Architecture 5 extensions */ +/* Nonzero if this chip supports the ARM Architecture 5 extensions. */ extern int arm_arch5; -/* Nonzero if this chip supports the ARM Architecture 5E extensions */ +/* Nonzero if this chip supports the ARM Architecture 5E extensions. */ extern int arm_arch5e; +/* Nonzero if this chip supports the ARM Architecture 6 extensions. 
*/ +extern int arm_arch6; + /* Nonzero if this chip can benefit from load scheduling. */ extern int arm_ld_sched; @@ -871,6 +946,11 @@ extern const char * structure_size_string; mvf1-mvf3 Cirrus floating point scratch mvf4-mvf15 S Cirrus floating point variable. */ +/* s0-s15 VFP scratch (aka d0-d7). + s16-s31 S VFP variable (aka d8-d15). + vfpcc Not a real register. Represents the VFP condition + code flags. */ + /* The stack backtrace structure is as follows: fp points to here: | save code pointer | [fp] | return link value | [fp, #-4] @@ -895,17 +975,22 @@ extern const char * structure_size_string; /* 1 for registers that have pervasive standard uses and are not available for the register allocator. */ -#define FIXED_REGISTERS \ -{ \ - 0,0,0,0,0,0,0,0, \ - 0,0,0,0,0,1,0,1, \ - 0,0,0,0,0,0,0,0, \ +#define FIXED_REGISTERS \ +{ \ + 0,0,0,0,0,0,0,0, \ + 0,0,0,0,0,1,0,1, \ + 0,0,0,0,0,0,0,0, \ 1,1,1, \ 1,1,1,1,1,1,1,1, \ - 1,1,1,1,1,1,1,1, \ - 1,1,1,1,1,1,1,1, \ - 1,1,1,1,1,1,1,1, \ - 1,1,1,1 \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1 \ } /* 1 for registers not available across function calls. 
@@ -926,7 +1011,12 @@ extern const char * structure_size_string; 1,1,1,1,1,1,1,1, \ 1,1,1,1,1,1,1,1, \ 1,1,1,1,1,1,1,1, \ - 1,1,1,1 \ + 1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1,1,1,1,1,1,1,1, \ + 1 \ } #ifndef SUBTARGET_CONDITIONAL_REGISTER_USAGE @@ -937,10 +1027,10 @@ extern const char * structure_size_string; { \ int regno; \ \ - if (TARGET_SOFT_FLOAT || TARGET_THUMB) \ + if (TARGET_SOFT_FLOAT || TARGET_THUMB || !TARGET_FPA) \ { \ - for (regno = FIRST_ARM_FP_REGNUM; \ - regno <= LAST_ARM_FP_REGNUM; ++regno) \ + for (regno = FIRST_FPA_REGNUM; \ + regno <= LAST_FPA_REGNUM; ++regno) \ fixed_regs[regno] = call_used_regs[regno] = 1; \ } \ \ @@ -960,16 +1050,28 @@ extern const char * structure_size_string; if (TARGET_THUMB) \ fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1; \ \ - if (TARGET_CIRRUS) \ + if (TARGET_ARM && TARGET_HARD_FLOAT) \ { \ - for (regno = FIRST_ARM_FP_REGNUM; \ - regno <= LAST_ARM_FP_REGNUM; ++ regno) \ - fixed_regs[regno] = call_used_regs[regno] = 1; \ - for (regno = FIRST_CIRRUS_FP_REGNUM; \ - regno <= LAST_CIRRUS_FP_REGNUM; ++ regno) \ + if (TARGET_MAVERICK) \ { \ - fixed_regs[regno] = 0; \ - call_used_regs[regno] = regno < FIRST_CIRRUS_FP_REGNUM + 4; \ + for (regno = FIRST_FPA_REGNUM; \ + regno <= LAST_FPA_REGNUM; ++ regno) \ + fixed_regs[regno] = call_used_regs[regno] = 1; \ + for (regno = FIRST_CIRRUS_FP_REGNUM; \ + regno <= LAST_CIRRUS_FP_REGNUM; ++ regno) \ + { \ + fixed_regs[regno] = 0; \ + call_used_regs[regno] = regno < FIRST_CIRRUS_FP_REGNUM + 4; \ + } \ + } \ + if (TARGET_VFP) \ + { \ + for (regno = FIRST_VFP_REGNUM; \ + regno <= LAST_VFP_REGNUM; ++ regno) \ + { \ + fixed_regs[regno] = 0; \ + call_used_regs[regno] = regno < FIRST_VFP_REGNUM + 16; \ + } \ } \ } \ \ @@ -1031,7 +1133,8 @@ extern const char * structure_size_string; /* Convert fron bytes to ints. 
*/ #define ARM_NUM_INTS(X) (((X) + UNITS_PER_WORD - 1) / UNITS_PER_WORD) -/* The number of (integer) registers required to hold a quantity of type MODE. */ +/* The number of (integer) registers required to hold a quantity of type MODE. + Also used for VFP registers. */ #define ARM_NUM_REGS(MODE) \ ARM_NUM_INTS (GET_MODE_SIZE (MODE)) @@ -1096,8 +1199,8 @@ extern const char * structure_size_string; #define STACK_POINTER_REGNUM SP_REGNUM /* ARM floating pointer registers. */ -#define FIRST_ARM_FP_REGNUM 16 -#define LAST_ARM_FP_REGNUM 23 +#define FIRST_FPA_REGNUM 16 +#define LAST_FPA_REGNUM 23 #define FIRST_IWMMXT_GR_REGNUM 43 #define LAST_IWMMXT_GR_REGNUM 46 @@ -1119,10 +1222,16 @@ extern const char * structure_size_string; #define IS_CIRRUS_REGNUM(REGNUM) \ (((REGNUM) >= FIRST_CIRRUS_FP_REGNUM) && ((REGNUM) <= LAST_CIRRUS_FP_REGNUM)) +#define FIRST_VFP_REGNUM 63 +#define LAST_VFP_REGNUM 94 +#define IS_VFP_REGNUM(REGNUM) \ + (((REGNUM) >= FIRST_VFP_REGNUM) && ((REGNUM) <= LAST_VFP_REGNUM)) + /* The number of hard registers is 16 ARM + 8 FPA + 1 CC + 1 SFP + 1 AFP. */ /* + 16 Cirrus registers take us up to 43. */ /* Intel Wireless MMX Technology registers add 16 + 4 more. */ -#define FIRST_PSEUDO_REGISTER 63 +/* VFP adds 32 + 1 more. */ +#define FIRST_PSEUDO_REGISTER 96 /* Value should be nonzero if functions must have frame pointers. Zero means the frame pointer need not be set up (and parms may be accessed @@ -1143,9 +1252,10 @@ extern const char * structure_size_string; mode. */ #define HARD_REGNO_NREGS(REGNO, MODE) \ ((TARGET_ARM \ - && REGNO >= FIRST_ARM_FP_REGNUM \ + && REGNO >= FIRST_FPA_REGNUM \ && REGNO != FRAME_POINTER_REGNUM \ && REGNO != ARG_POINTER_REGNUM) \ + && !IS_VFP_REGNUM (REGNO) \ ? 1 : ARM_NUM_REGS (MODE)) /* Return true if REGNO is suitable for holding a quantity of type MODE. */ @@ -1171,6 +1281,7 @@ extern const char * structure_size_string; clobber it anyway. 
Allocate r0 through r3 in reverse order since r3 is least likely to contain a function parameter; in addition results are returned in r0. */ + #define REG_ALLOC_ORDER \ { \ 3, 2, 1, 0, 12, 14, 4, 5, \ @@ -1181,7 +1292,12 @@ extern const char * structure_size_string; 43, 44, 45, 46, 47, 48, 49, 50, \ 51, 52, 53, 54, 55, 56, 57, 58, \ 59, 60, 61, 62, \ - 24, 25, 26 \ + 24, 25, 26, \ + 78, 77, 76, 75, 74, 73, 72, 71, \ + 70, 69, 68, 67, 66, 65, 64, 63, \ + 79, 80, 81, 82, 83, 84, 85, 86, \ + 87, 88, 89, 90, 91, 92, 93, 94, \ + 95 \ } /* Interrupt functions can only use registers that have already been @@ -1200,6 +1316,7 @@ enum reg_class NO_REGS, FPA_REGS, CIRRUS_REGS, + VFP_REGS, IWMMXT_GR_REGS, IWMMXT_REGS, LO_REGS, @@ -1207,6 +1324,7 @@ enum reg_class BASE_REGS, HI_REGS, CC_REG, + VFPCC_REG, GENERAL_REGS, ALL_REGS, LIM_REG_CLASSES @@ -1220,6 +1338,7 @@ enum reg_class "NO_REGS", \ "FPA_REGS", \ "CIRRUS_REGS", \ + "VFP_REGS", \ "IWMMXT_GR_REGS", \ "IWMMXT_REGS", \ "LO_REGS", \ @@ -1227,6 +1346,7 @@ enum reg_class "BASE_REGS", \ "HI_REGS", \ "CC_REG", \ + "VFPCC_REG" \ "GENERAL_REGS", \ "ALL_REGS", \ } @@ -1234,20 +1354,22 @@ enum reg_class /* Define which registers fit in which classes. This is an initializer for a vector of HARD_REG_SET of length N_REG_CLASSES. 
*/ -#define REG_CLASS_CONTENTS \ -{ \ - { 0x00000000, 0x0 }, /* NO_REGS */ \ - { 0x00FF0000, 0x0 }, /* FPA_REGS */ \ - { 0xF8000000, 0x000007FF }, /* CIRRUS_REGS */ \ - { 0x00000000, 0x00007800 }, /* IWMMXT_GR_REGS */\ - { 0x00000000, 0x7FFF8000 }, /* IWMMXT_REGS */ \ - { 0x000000FF, 0x0 }, /* LO_REGS */ \ - { 0x00002000, 0x0 }, /* STACK_REG */ \ - { 0x000020FF, 0x0 }, /* BASE_REGS */ \ - { 0x0000FF00, 0x0 }, /* HI_REGS */ \ - { 0x01000000, 0x0 }, /* CC_REG */ \ - { 0x0200FFFF, 0x0 }, /* GENERAL_REGS */\ - { 0xFAFFFFFF, 0x7FFFFFFF } /* ALL_REGS */ \ +#define REG_CLASS_CONTENTS \ +{ \ + { 0x00000000, 0x00000000, 0x00000000 }, /* NO_REGS */ \ + { 0x00FF0000, 0x00000000, 0x00000000 }, /* FPA_REGS */ \ + { 0xF8000000, 0x000007FF, 0x00000000 }, /* CIRRUS_REGS */ \ + { 0x00000000, 0x80000000, 0x7FFFFFFF }, /* VFP_REGS */ \ + { 0x00000000, 0x00007800, 0x00000000 }, /* IWMMXT_GR_REGS */ \ + { 0x00000000, 0x7FFF8000, 0x00000000 }, /* IWMMXT_REGS */ \ + { 0x000000FF, 0x00000000, 0x00000000 }, /* LO_REGS */ \ + { 0x00002000, 0x00000000, 0x00000000 }, /* STACK_REG */ \ + { 0x000020FF, 0x00000000, 0x00000000 }, /* BASE_REGS */ \ + { 0x0000FF00, 0x00000000, 0x00000000 }, /* HI_REGS */ \ + { 0x01000000, 0x00000000, 0x00000000 }, /* CC_REG */ \ + { 0x00000000, 0x00000000, 0x80000000 }, /* VFPCC_REG */ \ + { 0x0200FFFF, 0x00000000, 0x00000000 }, /* GENERAL_REGS */ \ + { 0xFAFFFFFF, 0xFFFFFFFF, 0x7FFFFFFF } /* ALL_REGS */ \ } /* The same information, inverted: @@ -1256,11 +1378,14 @@ enum reg_class or could index an array. */ #define REGNO_REG_CLASS(REGNO) arm_regno_class (REGNO) -/* FPA registers can't do dubreg as all values are reformatted to internal - precision. */ +/* FPA registers can't do subreg as all values are reformatted to internal + precision. VFP registers may only be accesed in the mode they + were set. */ #define CANNOT_CHANGE_MODE_CLASS(FROM, TO, CLASS) \ (GET_MODE_SIZE (FROM) != GET_MODE_SIZE (TO) \ - ? reg_classes_intersect_p (FPA_REGS, (CLASS)) : 0) + ? 
reg_classes_intersect_p (FPA_REGS, (CLASS)) \ + || reg_classes_intersect_p (VFP_REGS, (CLASS)) \ + : 0) /* The class value for index registers, and the one for base regs. */ #define INDEX_REG_CLASS (TARGET_THUMB ? LO_REGS : GENERAL_REGS) @@ -1287,6 +1412,7 @@ enum reg_class #define REG_CLASS_FROM_LETTER(C) \ ( (C) == 'f' ? FPA_REGS \ : (C) == 'v' ? CIRRUS_REGS \ + : (C) == 'w' ? VFP_REGS \ : (C) == 'y' ? IWMMXT_REGS \ : (C) == 'z' ? IWMMXT_GR_REGS \ : (C) == 'l' ? (TARGET_ARM ? GENERAL_REGS : LO_REGS) \ @@ -1331,10 +1457,10 @@ enum reg_class (TARGET_ARM ? \ CONST_OK_FOR_ARM_LETTER (VALUE, C) : CONST_OK_FOR_THUMB_LETTER (VALUE, C)) -/* Constant letter 'G' for the FPA immediate constants. +/* Constant letter 'G' for the FP immediate constants. 'H' means the same constant negated. */ #define CONST_DOUBLE_OK_FOR_ARM_LETTER(X, C) \ - ((C) == 'G' ? const_double_rtx_ok_for_fpa (X) : \ + ((C) == 'G' ? arm_const_double_rtx (X) : \ (C) == 'H' ? neg_const_double_rtx_ok_for_fpa (X) : 0) #define CONST_DOUBLE_OK_FOR_LETTER_P(X, C) \ @@ -1345,7 +1471,8 @@ enum reg_class an offset from a register. `S' means any symbol that has the SYMBOL_REF_FLAG set or a CONSTANT_POOL address. This means that the symbol is in the text segment and can be - accessed without using a load. */ + accessed without using a load. + 'U' is an address valid for VFP load/store insns. */ #define EXTRA_CONSTRAINT_ARM(OP, C) \ ((C) == 'Q' ? GET_CODE (OP) == MEM && GET_CODE (XEXP (OP, 0)) == REG : \ @@ -1354,6 +1481,7 @@ enum reg_class && CONSTANT_POOL_ADDRESS_P (XEXP (OP, 0))) : \ (C) == 'S' ? (optimize > 0 && CONSTANT_ADDRESS_P (OP)) : \ (C) == 'T' ? cirrus_memory_offset (OP) : \ + (C) == 'U' ? vfp_mem_operand (OP) : \ 0) #define EXTRA_CONSTRAINT_THUMB(X, C) \ @@ -1364,6 +1492,8 @@ enum reg_class (TARGET_ARM ? 
\ EXTRA_CONSTRAINT_ARM (X, C) : EXTRA_CONSTRAINT_THUMB (X, C)) +#define EXTRA_MEMORY_CONSTRAINT(C, STR) ((C) == 'U') + /* Given an rtx X being reloaded into a reg required to be in class CLASS, return the class of reg to actually use. In general this is just CLASS, but for the Thumb we prefer @@ -1391,15 +1521,23 @@ enum reg_class or out of a register in CLASS in MODE. If it can be done directly, NO_REGS is returned. */ #define SECONDARY_OUTPUT_RELOAD_CLASS(CLASS, MODE, X) \ - (TARGET_ARM ? \ - (((MODE) == HImode && ! arm_arch4 && true_regnum (X) == -1) \ + /* Restrict which direct reloads are allowed for VFP regs. */ \ + ((TARGET_VFP && TARGET_HARD_FLOAT \ + && (CLASS) == VFP_REGS) \ + ? vfp_secondary_reload_class (MODE, X) \ + : TARGET_ARM \ + ? (((MODE) == HImode && ! arm_arch4 && true_regnum (X) == -1) \ ? GENERAL_REGS : NO_REGS) \ : THUMB_SECONDARY_OUTPUT_RELOAD_CLASS (CLASS, MODE, X)) /* If we need to load shorts byte-at-a-time, then we need a scratch. */ #define SECONDARY_INPUT_RELOAD_CLASS(CLASS, MODE, X) \ + /* Restrict which direct reloads are allowed for VFP regs. */ \ + ((TARGET_VFP && TARGET_HARD_FLOAT \ + && (CLASS) == VFP_REGS) \ + ? vfp_secondary_reload_class (MODE, X) : \ /* Cannot load constants into Cirrus registers. */ \ - ((TARGET_CIRRUS \ + (TARGET_MAVERICK && TARGET_HARD_FLOAT \ && (CLASS) == CIRRUS_REGS \ && (CONSTANT_P (X) || GET_CODE (X) == SYMBOL_REF)) \ ? GENERAL_REGS : \ @@ -1433,13 +1571,14 @@ enum reg_class HOST_WIDE_INT val = INTVAL (XEXP (X, 1)); \ HOST_WIDE_INT low, high; \ \ - if (MODE == DImode || (TARGET_SOFT_FLOAT && MODE == DFmode)) \ + if (MODE == DImode || (TARGET_SOFT_FLOAT && TARGET_FPA \ + && MODE == DFmode)) \ low = ((val & 0xf) ^ 0x8) - 0x8; \ - else if (TARGET_CIRRUS) \ + else if (TARGET_MAVERICK && TARGET_HARD_FLOAT) \ /* Need to be careful, -256 is not a valid offset. */ \ low = val >= 0 ? 
(val & 0xff) : -((-val) & 0xff); \ else if (MODE == SImode \ - || (MODE == SFmode && TARGET_SOFT_FLOAT) \ + || (MODE == SFmode && TARGET_SOFT_FLOAT && TARGET_FPA) \ || ((MODE == HImode || MODE == QImode) && ! arm_arch4)) \ /* Need to be careful, -4096 is not a valid offset. */ \ low = val >= 0 ? (val & 0xfff) : -((-val) & 0xfff); \ @@ -1447,7 +1586,7 @@ enum reg_class /* Need to be careful, -256 is not a valid offset. */ \ low = val >= 0 ? (val & 0xff) : -((-val) & 0xff); \ else if (GET_MODE_CLASS (MODE) == MODE_FLOAT \ - && TARGET_HARD_FLOAT) \ + && TARGET_HARD_FLOAT && TARGET_FPA) \ /* Need to be careful, -1024 is not a valid offset. */ \ low = val >= 0 ? (val & 0x3ff) : -((-val) & 0x3ff); \ else \ @@ -1520,6 +1659,8 @@ enum reg_class (TARGET_ARM ? \ ((FROM) == FPA_REGS && (TO) != FPA_REGS ? 20 : \ (FROM) != FPA_REGS && (TO) == FPA_REGS ? 20 : \ + (FROM) == VFP_REGS && (TO) != VFP_REGS ? 10 : \ + (FROM) != VFP_REGS && (TO) == VFP_REGS ? 10 : \ (FROM) == IWMMXT_REGS && (TO) != IWMMXT_REGS ? 4 : \ (FROM) != IWMMXT_REGS && (TO) == IWMMXT_REGS ? 4 : \ (FROM) == IWMMXT_GR_REGS || (TO) == IWMMXT_GR_REGS ? 20 : \ @@ -1575,9 +1716,11 @@ enum reg_class /* Define how to find the value returned by a library function assuming the value has mode MODE. */ #define LIBCALL_VALUE(MODE) \ - (TARGET_ARM && TARGET_HARD_FLOAT && GET_MODE_CLASS (MODE) == MODE_FLOAT \ - ? gen_rtx_REG (MODE, FIRST_ARM_FP_REGNUM) \ - : TARGET_ARM && TARGET_CIRRUS && GET_MODE_CLASS (MODE) == MODE_FLOAT \ + (TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA \ + && GET_MODE_CLASS (MODE) == MODE_FLOAT \ + ? gen_rtx_REG (MODE, FIRST_FPA_REGNUM) \ + : TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK \ + && GET_MODE_CLASS (MODE) == MODE_FLOAT \ ? gen_rtx_REG (MODE, FIRST_CIRRUS_FP_REGNUM) \ : TARGET_REALLY_IWMMXT && VECTOR_MODE_SUPPORTED_P (MODE) \ ? gen_rtx_REG (MODE, FIRST_IWMMXT_REGNUM) \ @@ -1595,9 +1738,11 @@ enum reg_class /* On a Cirrus chip, mvf0 can return results. 
*/ #define FUNCTION_VALUE_REGNO_P(REGNO) \ ((REGNO) == ARG_REGISTER (1) \ - || (TARGET_ARM && ((REGNO) == FIRST_CIRRUS_FP_REGNUM) && TARGET_CIRRUS) \ + || (TARGET_ARM && ((REGNO) == FIRST_CIRRUS_FP_REGNUM) \ + && TARGET_HARD_FLOAT && TARGET_MAVERICK) \ || (TARGET_ARM && ((REGNO) == FIRST_IWMMXT_REGNUM) && TARGET_IWMMXT) \ - || (TARGET_ARM && ((REGNO) == FIRST_ARM_FP_REGNUM) && TARGET_HARD_FLOAT)) + || (TARGET_ARM && ((REGNO) == FIRST_FPA_REGNUM) \ + && TARGET_HARD_FLOAT && TARGET_FPA)) /* How large values are returned */ /* A C expression which can inhibit the returning of certain function values @@ -2081,6 +2226,7 @@ typedef struct #define THUMB_LEGITIMATE_CONSTANT_P(X) \ ( GET_CODE (X) == CONST_INT \ || GET_CODE (X) == CONST_DOUBLE \ + || GET_CODE (X) == CONSTANT_P_RTX \ || CONSTANT_ADDRESS_P (X) \ || flag_pic) @@ -2433,10 +2579,11 @@ extern int making_const_table; { \ if (TARGET_THUMB) \ { \ - if (is_called_in_ARM_mode (DECL)) \ + if (is_called_in_ARM_mode (DECL) \ + || current_function_is_thunk) \ fprintf (STREAM, "\t.code 32\n") ; \ else \ - fprintf (STREAM, "\t.thumb_func\n") ; \ + fprintf (STREAM, "\t.code 16\n\t.thumb_func\n") ; \ } \ if (TARGET_POKE_FUNCTION_NAME) \ arm_poke_function_name (STREAM, (char *) NAME); \ @@ -2665,12 +2812,13 @@ extern int making_const_table; /* Define the codes that are matched by predicates in arm.c */ #define PREDICATE_CODES \ {"s_register_operand", {SUBREG, REG}}, \ + {"arm_general_register_operand", {SUBREG, REG}}, \ {"arm_hard_register_operand", {REG}}, \ {"f_register_operand", {SUBREG, REG}}, \ {"arm_add_operand", {SUBREG, REG, CONST_INT}}, \ {"arm_addimm_operand", {CONST_INT}}, \ - {"fpa_add_operand", {SUBREG, REG, CONST_DOUBLE}}, \ - {"fpa_rhs_operand", {SUBREG, REG, CONST_DOUBLE}}, \ + {"arm_float_add_operand", {SUBREG, REG, CONST_DOUBLE}}, \ + {"arm_float_rhs_operand", {SUBREG, REG, CONST_DOUBLE}}, \ {"arm_rhs_operand", {SUBREG, REG, CONST_INT}}, \ {"arm_not_operand", {SUBREG, REG, CONST_INT}}, \ {"reg_or_int_operand", 
{SUBREG, REG, CONST_INT}}, \ @@ -2702,7 +2850,9 @@ extern int making_const_table; {"cirrus_register_operand", {REG}}, \ {"cirrus_fp_register", {REG}}, \ {"cirrus_shift_const", {CONST_INT}}, \ - {"dominant_cc_register", {REG}}, + {"dominant_cc_register", {REG}}, \ + {"arm_float_compare_operand", {REG, CONST_DOUBLE}}, \ + {"vfp_compare_operand", {REG, CONST_DOUBLE}}, /* Define this if you have special predicates that know special things about modes. Genrecog will warn about certain forms of diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index d05c6980659..cafac0b58e4 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -145,7 +145,7 @@ ; Floating Point Unit. If we only have floating point emulation, then there ; is no point in scheduling the floating point insns. (Well, for best ; performance we should try and group them together). -(define_attr "fpu" "softfpa,fpa,fpe2,fpe3,maverick" +(define_attr "fpu" "none,fpa,fpe2,fpe3,maverick,vfp" (const (symbol_ref "arm_fpu_attr"))) ; LENGTH of an instruction (in bytes) @@ -167,13 +167,26 @@ (set_attr "length" "4") (set_attr "pool_range" "250")]) +;; The instruction used to implement a particular pattern. This +;; information is used by pipeline descriptions to provide accurate +;; scheduling information. + +(define_attr "insn" + "smulxy,smlaxy,smlalxy,smulwy,smlawx,mul,muls,mla,mlas,umull,umulls,umlal,umlals,smull,smulls,smlal,smlals,smlawy,smuad,smuadx,smlad,smladx,smusd,smusdx,smlsd,smlsdx,smmul,smmulr,other" + (const_string "other")) + ; TYPE attribute is used to detect floating point instructions which, if ; running on a co-processor can run in parallel with other, basic instructions ; If write-buffer scheduling is enabled then it can also be used in the ; scheduling of writes. 
; Classification of each insn -; normal any data instruction that doesn't hit memory or fp regs +; alu any alu instruction that doesn't hit memory or fp +; regs or have a shifted source operand +; alu_shift any data instruction that doesn't hit memory or fp +; regs, but has a source operand shifted by a constant +; alu_shift_reg any data instruction that doesn't hit memory or fp +; regs, but has a source operand shifted by a register value ; mult a multiply instruction ; block blockage insn, this blocks all functional units ; float a floating point arithmetic operation (subject to expansion) @@ -191,19 +204,27 @@ ; r_mem_f the reverse of f_mem_r ; f_2_r fast transfer float to arm (no memory needed) ; r_2_f fast transfer arm to float +; branch a branch ; call a subroutine call -; load any load from memory -; store1 store 1 word to memory from arm registers +; load_byte load byte(s) from memory to arm registers +; load1 load 1 word from memory to arm registers +; load2 load 2 words from memory to arm registers +; load3 load 3 words from memory to arm registers +; load4 load 4 words from memory to arm registers +; store store 1 word to memory from arm registers ; store2 store 2 words ; store3 store 3 words -; store4 store 4 words +; store4 store 4 (or more) words ; Additions for Cirrus Maverick co-processor: ; mav_farith Floating point arithmetic (4 cycle) ; mav_dmult Double multiplies (7 cycle) ; (define_attr "type" - "normal,mult,block,float,fdivx,fdivd,fdivs,fmul,ffmul,farith,ffarith,float_em,f_load,f_store,f_mem_r,r_mem_f,f_2_r,r_2_f,call,load,store1,store2,store3,store4,mav_farith,mav_dmult" - (const_string "normal")) + "alu,alu_shift,alu_shift_reg,mult,block,float,fdivx,fdivd,fdivs,fmul,ffmul,farith,ffarith,float_em,f_load,f_store,f_mem_r,r_mem_f,f_2_r,r_2_f,branch,call,load_byte,load1,load2,load3,load4,store1,store2,store3,store4,mav_farith,mav_dmult" + (if_then_else + (eq_attr "insn" 
"smulxy,smlaxy,smlalxy,smulwy,smlawx,mul,muls,mla,mlas,umull,umulls,umlal,umlals,smull,smulls,smlal,smlals") + (const_string "mult") + (const_string "alu"))) ; Load scheduling, set from the arm_ld_sched variable ; initialized by arm_override_options() @@ -251,7 +272,7 @@ ; to stall the processor. Used with model_wbuf above. (define_attr "write_conflict" "no,yes" (if_then_else (eq_attr "type" - "block,float_em,f_load,f_store,f_mem_r,r_mem_f,call,load") + "block,float_em,f_load,f_store,f_mem_r,r_mem_f,call,load1") (const_string "yes") (const_string "no"))) @@ -259,7 +280,7 @@ ; than one on the main cpu execution unit. (define_attr "core_cycles" "single,multi" (if_then_else (eq_attr "type" - "normal,float,fdivx,fdivd,fdivs,fmul,ffmul,farith,ffarith") + "alu,alu_shift,float,fdivx,fdivd,fdivs,fmul,ffmul,farith,ffarith") (const_string "single") (const_string "multi"))) @@ -267,115 +288,27 @@ ;; distant label. Only applicable to Thumb code. (define_attr "far_jump" "yes,no" (const_string "no")) -(define_automaton "arm") +;;--------------------------------------------------------------------------- +;; Pipeline descriptions -;; Write buffer -; -; Strictly, we should model a 4-deep write buffer for ARM7xx based chips -; -; The write buffer on some of the arm6 processors is hard to model exactly. -; There is room in the buffer for up to two addresses and up to eight words -; of memory, but the two needn't be split evenly. When writing the two -; addresses are fully pipelined. However, a read from memory that is not -; currently in the cache will block until the writes have completed. -; It is normally the case that FCLK and MCLK will be in the ratio 2:1, so -; writes will take 2 FCLK cycles per word, if FCLK and MCLK are asynchronous -; (they aren't allowed to be at present) then there is a startup cost of 1MCLK -; cycle to add as well. 
-(define_cpu_unit "write_buf" "arm") - -;; Write blockage unit -; -; The write_blockage unit models (partially), the fact that reads will stall -; until the write buffer empties. -; The f_mem_r and r_mem_f could also block, but they are to the stack, -; so we don't model them here -(define_cpu_unit "write_blockage" "arm") +;; Processor type. This attribute must exactly match the table in +;; arm-cores.def. +(define_attr "tune" + "arm2,arm250,arm3,arm6,arm60,arm600,arm610,arm620,arm7,arm7m,arm7d,arm7dm,arm7di,arm7dmi,arm70,arm700,arm700i,arm710,arm720,arm710c,arm7100,arm7500,arm7500fe,arm7tdmi,arm710t,arm720t,arm740t,arm8,arm810,arm9,arm920,arm920t,arm940t,arm9tdmi,arm9e,ep9312,strongarm,strongarm110,strongarm1100,strongarm1110,arm10tdmi,arm1020t,arm926ejs,arm1026ejs,xscale,iwmmxt,arm1136js,arm1136jfs" + (const (symbol_ref "arm_tune"))) -;; Core -; -(define_cpu_unit "core" "arm") - -(define_insn_reservation "r_mem_f_wbuf" 5 - (and (eq_attr "model_wbuf" "yes") - (eq_attr "type" "r_mem_f")) - "core+write_buf*3") - -(define_insn_reservation "store1_wbuf" 5 - (and (eq_attr "model_wbuf" "yes") - (eq_attr "type" "store1")) - "core+write_buf*3+write_blockage*5") - -(define_insn_reservation "store2_wbuf" 7 - (and (eq_attr "model_wbuf" "yes") - (eq_attr "type" "store2")) - "core+write_buf*4+write_blockage*7") - -(define_insn_reservation "store3_wbuf" 9 - (and (eq_attr "model_wbuf" "yes") - (eq_attr "type" "store3")) - "core+write_buf*5+write_blockage*9") - -(define_insn_reservation "store4_wbuf" 11 - (and (eq_attr "model_wbuf" "yes") - (eq_attr "type" "store4")) - "core+write_buf*6+write_blockage*11") - -(define_insn_reservation "store2" 3 - (and (eq_attr "model_wbuf" "no") - (eq_attr "type" "store2")) - "core*3") - -(define_insn_reservation "store3" 4 - (and (eq_attr "model_wbuf" "no") - (eq_attr "type" "store3")) - "core*4") - -(define_insn_reservation "store4" 5 - (and (eq_attr "model_wbuf" "no") - (eq_attr "type" "store4")) - "core*5") - -(define_insn_reservation 
"store1_ldsched" 1 - (and (eq_attr "ldsched" "yes") (eq_attr "type" "store1")) - "core") - -(define_insn_reservation "load_ldsched_xscale" 3 - (and (and (eq_attr "ldsched" "yes") (eq_attr "type" "load")) - (eq_attr "is_xscale" "yes")) - "core") - -(define_insn_reservation "load_ldsched" 2 - (and (and (eq_attr "ldsched" "yes") (eq_attr "type" "load")) - (eq_attr "is_xscale" "no")) - "core") - -(define_insn_reservation "load_or_store" 2 - (and (eq_attr "ldsched" "!yes") (eq_attr "type" "load,store1")) - "core*2") - -(define_insn_reservation "mult" 16 - (and (eq_attr "ldsched" "no") (eq_attr "type" "mult")) - "core*16") - -(define_insn_reservation "mult_ldsched_strongarm" 3 - (and (and (eq_attr "ldsched" "yes") (eq_attr "is_strongarm" "yes")) - (eq_attr "type" "mult")) - "core*2") - -(define_insn_reservation "mult_ldsched" 4 - (and (and (eq_attr "ldsched" "yes") (eq_attr "is_strongarm" "no")) - (eq_attr "type" "mult")) - "core*4") - -(define_insn_reservation "multi_cycle" 32 - (and (eq_attr "core_cycles" "multi") - (eq_attr "type" "!mult,load,store1,store2,store3,store4")) - "core*32") - -(define_insn_reservation "single_cycle" 1 - (eq_attr "core_cycles" "single") - "core") +;; True if the generic scheduling description should be used. 
+ +(define_attr "generic_sched" "yes,no" + (if_then_else + (eq_attr "tune" "arm926ejs,arm1026ejs,arm1136js,arm1136jfs") + (const_string "no") + (const_string "yes"))) + +(include "arm-generic.md") +(include "arm926ejs.md") +(include "arm1026ejs.md") +(include "arm1136jfs.md") ;;--------------------------------------------------------------------------- @@ -397,7 +330,7 @@ (clobber (reg:CC CC_REGNUM))])] "TARGET_EITHER" " - if (TARGET_CIRRUS) + if (TARGET_HARD_FLOAT && TARGET_MAVERICK) { if (!cirrus_fp_register (operands[0], DImode)) operands[0] = force_reg (DImode, operands[0]); @@ -433,7 +366,7 @@ (plus:DI (match_operand:DI 1 "s_register_operand" "%0, 0") (match_operand:DI 2 "s_register_operand" "r, 0"))) (clobber (reg:CC CC_REGNUM))] - "TARGET_ARM && !TARGET_CIRRUS" + "TARGET_ARM && !(TARGET_HARD_FLOAT && TARGET_MAVERICK)" "#" "TARGET_ARM && reload_completed" [(parallel [(set (reg:CC_C CC_REGNUM) @@ -461,7 +394,7 @@ (match_operand:SI 2 "s_register_operand" "r,r")) (match_operand:DI 1 "s_register_operand" "r,0"))) (clobber (reg:CC CC_REGNUM))] - "TARGET_ARM && !TARGET_CIRRUS" + "TARGET_ARM && !(TARGET_HARD_FLOAT && TARGET_MAVERICK)" "#" "TARGET_ARM && reload_completed" [(parallel [(set (reg:CC_C CC_REGNUM) @@ -490,7 +423,7 @@ (match_operand:SI 2 "s_register_operand" "r,r")) (match_operand:DI 1 "s_register_operand" "r,0"))) (clobber (reg:CC CC_REGNUM))] - "TARGET_ARM && !TARGET_CIRRUS" + "TARGET_ARM && !(TARGET_HARD_FLOAT && TARGET_MAVERICK)" "#" "TARGET_ARM && reload_completed" [(parallel [(set (reg:CC_C CC_REGNUM) @@ -801,7 +734,10 @@ (match_operand:SI 1 "s_register_operand" "r"))))] "TARGET_ARM" "adc%?\\t%0, %1, %3%S2" - [(set_attr "conds" "use")] + [(set_attr "conds" "use") + (set (attr "type") (if_then_else (match_operand 4 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*addsi3_carryin_alt1" @@ -864,10 +800,10 @@ (define_expand "addsf3" [(set (match_operand:SF 0 "s_register_operand" "") (plus:SF 
(match_operand:SF 1 "s_register_operand" "") - (match_operand:SF 2 "fpa_add_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (match_operand:SF 2 "arm_float_add_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS + if (TARGET_MAVERICK && !cirrus_fp_register (operands[2], SFmode)) operands[2] = force_reg (SFmode, operands[2]); ") @@ -875,10 +811,10 @@ (define_expand "adddf3" [(set (match_operand:DF 0 "s_register_operand" "") (plus:DF (match_operand:DF 1 "s_register_operand" "") - (match_operand:DF 2 "fpa_add_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (match_operand:DF 2 "arm_float_add_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS + if (TARGET_MAVERICK && !cirrus_fp_register (operands[2], DFmode)) operands[2] = force_reg (DFmode, operands[2]); ") @@ -891,7 +827,7 @@ (clobber (reg:CC CC_REGNUM))])] "TARGET_EITHER" " - if (TARGET_CIRRUS + if (TARGET_HARD_FLOAT && TARGET_MAVERICK && TARGET_ARM && cirrus_fp_register (operands[0], DImode) && cirrus_fp_register (operands[1], DImode)) @@ -1087,11 +1023,11 @@ (define_expand "subsf3" [(set (match_operand:SF 0 "s_register_operand" "") - (minus:SF (match_operand:SF 1 "fpa_rhs_operand" "") - (match_operand:SF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (minus:SF (match_operand:SF 1 "arm_float_rhs_operand" "") + (match_operand:SF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS) + if (TARGET_MAVERICK) { if (!cirrus_fp_register (operands[1], SFmode)) operands[1] = force_reg (SFmode, operands[1]); @@ -1102,11 +1038,11 @@ (define_expand "subdf3" [(set (match_operand:DF 0 "s_register_operand" "") - (minus:DF (match_operand:DF 1 "fpa_rhs_operand" "") - (match_operand:DF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (minus:DF (match_operand:DF 1 "arm_float_rhs_operand" "") + (match_operand:DF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if 
(TARGET_CIRRUS) + if (TARGET_MAVERICK) { if (!cirrus_fp_register (operands[1], DFmode)) operands[1] = force_reg (DFmode, operands[1]); @@ -1133,7 +1069,7 @@ (match_operand:SI 1 "s_register_operand" "%?r,0")))] "TARGET_ARM" "mul%?\\t%0, %2, %1" - [(set_attr "type" "mult") + [(set_attr "insn" "mul") (set_attr "predicable" "yes")] ) @@ -1154,7 +1090,7 @@ return \"mul\\t%0, %0, %2\"; " [(set_attr "length" "4,4,2") - (set_attr "type" "mult")] + (set_attr "insn" "mul")] ) (define_insn "*mulsi3_compare0" @@ -1168,7 +1104,7 @@ "TARGET_ARM && !arm_arch_xscale" "mul%?s\\t%0, %2, %1" [(set_attr "conds" "set") - (set_attr "type" "mult")] + (set_attr "insn" "muls")] ) (define_insn "*mulsi_compare0_scratch" @@ -1181,7 +1117,7 @@ "TARGET_ARM && !arm_arch_xscale" "mul%?s\\t%0, %2, %1" [(set_attr "conds" "set") - (set_attr "type" "mult")] + (set_attr "insn" "muls")] ) ;; Unnamed templates to match MLA instruction. @@ -1194,7 +1130,7 @@ (match_operand:SI 3 "s_register_operand" "?r,r,0,0")))] "TARGET_ARM" "mla%?\\t%0, %2, %1, %3" - [(set_attr "type" "mult") + [(set_attr "insn" "mla") (set_attr "predicable" "yes")] ) @@ -1212,7 +1148,7 @@ "TARGET_ARM && !arm_arch_xscale" "mla%?s\\t%0, %2, %1, %3" [(set_attr "conds" "set") - (set_attr "type" "mult")] + (set_attr "insn" "mlas")] ) (define_insn "*mulsi3addsi_compare0_scratch" @@ -1227,7 +1163,7 @@ "TARGET_ARM && !arm_arch_xscale" "mla%?s\\t%0, %2, %1, %3" [(set_attr "conds" "set") - (set_attr "type" "mult")] + (set_attr "insn" "mlas")] ) ;; Unnamed template to match long long multiply-accumulate (smlal) @@ -1239,9 +1175,9 @@ (sign_extend:DI (match_operand:SI 2 "s_register_operand" "%r")) (sign_extend:DI (match_operand:SI 3 "s_register_operand" "r"))) (match_operand:DI 1 "s_register_operand" "0")))] - "TARGET_ARM && arm_fast_multiply" + "TARGET_ARM && arm_arch3m" "smlal%?\\t%Q0, %R0, %3, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "smlal") (set_attr "predicable" "yes")] ) @@ -1250,9 +1186,9 @@ (mult:DI (sign_extend:DI 
(match_operand:SI 1 "s_register_operand" "%r")) (sign_extend:DI (match_operand:SI 2 "s_register_operand" "r"))))] - "TARGET_ARM && arm_fast_multiply" + "TARGET_ARM && arm_arch3m" "smull%?\\t%Q0, %R0, %1, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "smull") (set_attr "predicable" "yes")] ) @@ -1261,9 +1197,9 @@ (mult:DI (zero_extend:DI (match_operand:SI 1 "s_register_operand" "%r")) (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r"))))] - "TARGET_ARM && arm_fast_multiply" + "TARGET_ARM && arm_arch3m" "umull%?\\t%Q0, %R0, %1, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "umull") (set_attr "predicable" "yes")] ) @@ -1276,9 +1212,9 @@ (zero_extend:DI (match_operand:SI 2 "s_register_operand" "%r")) (zero_extend:DI (match_operand:SI 3 "s_register_operand" "r"))) (match_operand:DI 1 "s_register_operand" "0")))] - "TARGET_ARM && arm_fast_multiply" + "TARGET_ARM && arm_arch3m" "umlal%?\\t%Q0, %R0, %3, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "umlal") (set_attr "predicable" "yes")] ) @@ -1291,9 +1227,9 @@ (sign_extend:DI (match_operand:SI 2 "s_register_operand" "r,r"))) (const_int 32)))) (clobber (match_scratch:SI 3 "=&r,&r"))] - "TARGET_ARM && arm_fast_multiply" + "TARGET_ARM && arm_arch3m" "smull%?\\t%3, %0, %2, %1" - [(set_attr "type" "mult") + [(set_attr "insn" "smull") (set_attr "predicable" "yes")] ) @@ -1306,9 +1242,9 @@ (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r,r"))) (const_int 32)))) (clobber (match_scratch:SI 3 "=&r,&r"))] - "TARGET_ARM && arm_fast_multiply" + "TARGET_ARM && arm_arch3m" "umull%?\\t%3, %0, %2, %1" - [(set_attr "type" "mult") + [(set_attr "insn" "umull") (set_attr "predicable" "yes")] ) @@ -1320,7 +1256,7 @@ (match_operand:HI 2 "s_register_operand" "r"))))] "TARGET_ARM && arm_arch5e" "smulbb%?\\t%0, %1, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "smulxy") (set_attr "predicable" "yes")] ) @@ -1333,7 +1269,7 @@ (match_operand:HI 2 "s_register_operand" "r"))))] "TARGET_ARM && arm_arch5e" 
"smultb%?\\t%0, %1, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "smulxy") (set_attr "predicable" "yes")] ) @@ -1346,7 +1282,7 @@ (const_int 16))))] "TARGET_ARM && arm_arch5e" "smulbt%?\\t%0, %1, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "smulxy") (set_attr "predicable" "yes")] ) @@ -1360,7 +1296,7 @@ (const_int 16))))] "TARGET_ARM && arm_arch5e" "smultt%?\\t%0, %1, %2" - [(set_attr "type" "mult") + [(set_attr "insn" "smulxy") (set_attr "predicable" "yes")] ) @@ -1373,7 +1309,7 @@ (match_operand:HI 3 "s_register_operand" "r")))))] "TARGET_ARM && arm_arch5e" "smlabb%?\\t%0, %2, %3, %1" - [(set_attr "type" "mult") + [(set_attr "insn" "smlaxy") (set_attr "predicable" "yes")] ) @@ -1387,16 +1323,16 @@ (match_operand:HI 3 "s_register_operand" "r")))))] "TARGET_ARM && arm_arch5e" "smlalbb%?\\t%Q0, %R0, %2, %3" - [(set_attr "type" "mult") + [(set_attr "insn" "smlalxy") (set_attr "predicable" "yes")]) (define_expand "mulsf3" [(set (match_operand:SF 0 "s_register_operand" "") (mult:SF (match_operand:SF 1 "s_register_operand" "") - (match_operand:SF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (match_operand:SF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS + if (TARGET_MAVERICK && !cirrus_fp_register (operands[2], SFmode)) operands[2] = force_reg (SFmode, operands[2]); ") @@ -1404,10 +1340,10 @@ (define_expand "muldf3" [(set (match_operand:DF 0 "s_register_operand" "") (mult:DF (match_operand:DF 1 "s_register_operand" "") - (match_operand:DF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS + if (TARGET_MAVERICK && !cirrus_fp_register (operands[2], DFmode)) operands[2] = force_reg (DFmode, operands[2]); ") @@ -1416,16 +1352,16 @@ (define_expand "divsf3" [(set (match_operand:SF 0 "s_register_operand" "") - (div:SF (match_operand:SF 1 "fpa_rhs_operand" "") - 
(match_operand:SF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (div:SF (match_operand:SF 1 "arm_float_rhs_operand" "") + (match_operand:SF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)" "") (define_expand "divdf3" [(set (match_operand:DF 0 "s_register_operand" "") - (div:DF (match_operand:DF 1 "fpa_rhs_operand" "") - (match_operand:DF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (div:DF (match_operand:DF 1 "arm_float_rhs_operand" "") + (match_operand:DF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)" "") ;; Modulo insns @@ -1433,15 +1369,15 @@ (define_expand "modsf3" [(set (match_operand:SF 0 "s_register_operand" "") (mod:SF (match_operand:SF 1 "s_register_operand" "") - (match_operand:SF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "") (define_expand "moddf3" [(set (match_operand:DF 0 "s_register_operand" "") (mod:DF (match_operand:DF 1 "s_register_operand" "") - (match_operand:DF 2 "fpa_rhs_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "") ;; Boolean and,ior,xor insns @@ -2048,16 +1984,18 @@ ) (define_insn "andsi_not_shiftsi_si" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (and:SI (not:SI (match_operator:SI 4 "shift_operator" - [(match_operand:SI 2 "s_register_operand" "r") - (match_operand:SI 3 "arm_rhs_operand" "rM")])) - (match_operand:SI 1 "s_register_operand" "r")))] + [(set (match_operand:SI 0 "s_register_operand" "=r") + (and:SI (not:SI (match_operator:SI 4 "shift_operator" + [(match_operand:SI 2 "s_register_operand" "r") + (match_operand:SI 3 "arm_rhs_operand" "rM")])) + (match_operand:SI 1 "s_register_operand" "r")))] "TARGET_ARM" "bic%?\\t%0, %1, %2%S4" [(set_attr 
"predicable" "yes") (set_attr "shift" "2") - ] + (set (attr "type") (if_then_else (match_operand 3 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*andsi_notsi_si_compare0" @@ -2537,6 +2475,41 @@ ;; Shift and rotation insns +(define_expand "ashldi3" + [(set (match_operand:DI 0 "s_register_operand" "") + (ashift:DI (match_operand:DI 1 "s_register_operand" "") + (match_operand:SI 2 "reg_or_int_operand" "")))] + "TARGET_ARM" + " + if (GET_CODE (operands[2]) == CONST_INT) + { + if ((HOST_WIDE_INT) INTVAL (operands[2]) == 1) + { + emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1])); + DONE; + } + /* Ideally we shouldn't fail here if we could know that operands[1] + ends up already living in an iwmmxt register. Otherwise it's + cheaper to have the alternate code being generated than moving + values to iwmmxt regs and back. */ + FAIL; + } + else if (!TARGET_REALLY_IWMMXT && !(TARGET_HARD_FLOAT && TARGET_MAVERICK)) + FAIL; + " +) + +(define_insn "arm_ashldi3_1bit" + [(set (match_operand:DI 0 "s_register_operand" "=&r,r") + (ashift:DI (match_operand:DI 1 "s_register_operand" "?r,0") + (const_int 1))) + (clobber (reg:CC CC_REGNUM))] + "TARGET_ARM" + "movs\\t%Q0, %Q1, asl #1\;adc\\t%R0, %R1, %R1" + [(set_attr "conds" "clob") + (set_attr "length" "8")] +) + (define_expand "ashlsi3" [(set (match_operand:SI 0 "s_register_operand" "") (ashift:SI (match_operand:SI 1 "s_register_operand" "") @@ -2561,6 +2534,41 @@ [(set_attr "length" "2")] ) +(define_expand "ashrdi3" + [(set (match_operand:DI 0 "s_register_operand" "") + (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "") + (match_operand:SI 2 "reg_or_int_operand" "")))] + "TARGET_ARM" + " + if (GET_CODE (operands[2]) == CONST_INT) + { + if ((HOST_WIDE_INT) INTVAL (operands[2]) == 1) + { + emit_insn (gen_arm_ashrdi3_1bit (operands[0], operands[1])); + DONE; + } + /* Ideally we shouldn't fail here if we could know that operands[1] + ends up already living in an 
iwmmxt register. Otherwise it's + cheaper to have the alternate code being generated than moving + values to iwmmxt regs and back. */ + FAIL; + } + else if (!TARGET_REALLY_IWMMXT) + FAIL; + " +) + +(define_insn "arm_ashrdi3_1bit" + [(set (match_operand:DI 0 "s_register_operand" "=&r,r") + (ashiftrt:DI (match_operand:DI 1 "s_register_operand" "?r,0") + (const_int 1))) + (clobber (reg:CC CC_REGNUM))] + "TARGET_ARM" + "movs\\t%R0, %R1, asr #1\;mov\\t%Q0, %Q1, rrx" + [(set_attr "conds" "clob") + (set_attr "length" "8")] +) + (define_expand "ashrsi3" [(set (match_operand:SI 0 "s_register_operand" "") (ashiftrt:SI (match_operand:SI 1 "s_register_operand" "") @@ -2582,6 +2590,41 @@ [(set_attr "length" "2")] ) +(define_expand "lshrdi3" + [(set (match_operand:DI 0 "s_register_operand" "") + (lshiftrt:DI (match_operand:DI 1 "s_register_operand" "") + (match_operand:SI 2 "reg_or_int_operand" "")))] + "TARGET_ARM" + " + if (GET_CODE (operands[2]) == CONST_INT) + { + if ((HOST_WIDE_INT) INTVAL (operands[2]) == 1) + { + emit_insn (gen_arm_lshrdi3_1bit (operands[0], operands[1])); + DONE; + } + /* Ideally we shouldn't fail here if we could know that operands[1] + ends up already living in an iwmmxt register. Otherwise it's + cheaper to have the alternate code being generated than moving + values to iwmmxt regs and back. 
*/ + FAIL; + } + else if (!TARGET_REALLY_IWMMXT) + FAIL; + " +) + +(define_insn "arm_lshrdi3_1bit" + [(set (match_operand:DI 0 "s_register_operand" "=&r,r") + (lshiftrt:DI (match_operand:DI 1 "s_register_operand" "?r,0") + (const_int 1))) + (clobber (reg:CC CC_REGNUM))] + "TARGET_ARM" + "movs\\t%R0, %R1, lsr #1\;mov\\t%Q0, %Q1, rrx" + [(set_attr "conds" "clob") + (set_attr "length" "8")] +) + (define_expand "lshrsi3" [(set (match_operand:SI 0 "s_register_operand" "") (lshiftrt:SI (match_operand:SI 1 "s_register_operand" "") @@ -2652,19 +2695,6 @@ [(set_attr "length" "2")] ) -(define_expand "ashldi3" - [(set (match_operand:DI 0 "s_register_operand" "") - (ashift:DI (match_operand:DI 1 "general_operand" "") - (match_operand:SI 2 "general_operand" "")))] - "TARGET_ARM && (TARGET_IWMMXT || TARGET_CIRRUS)" - " - if (! s_register_operand (operands[1], DImode)) - operands[1] = copy_to_mode_reg (DImode, operands[1]); - if (! s_register_operand (operands[2], SImode)) - operands[2] = copy_to_mode_reg (SImode, operands[2]); - " -) - (define_insn "*arm_shiftsi3" [(set (match_operand:SI 0 "s_register_operand" "=r") (match_operator:SI 3 "shift_operator" @@ -2674,7 +2704,9 @@ "mov%?\\t%0, %1%S3" [(set_attr "predicable" "yes") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*shiftsi3_compare0" @@ -2689,7 +2721,9 @@ "mov%?s\\t%0, %1%S3" [(set_attr "conds" "set") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*shiftsi3_compare0_scratch" @@ -2702,8 +2736,7 @@ "TARGET_ARM" "mov%?s\\t%0, %1%S3" [(set_attr "conds" "set") - (set_attr "shift" "1") - ] + (set_attr "shift" "1")] ) (define_insn "*notsi_shiftsi" @@ -2715,7 +2748,9 @@ "mvn%?\\t%0, %1%S3" [(set_attr "predicable" "yes") (set_attr "shift" "1") - ] + (set (attr 
"type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*notsi_shiftsi_compare0" @@ -2730,7 +2765,9 @@ "mvn%?s\\t%0, %1%S3" [(set_attr "conds" "set") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*not_shiftsi_compare0_scratch" @@ -2744,7 +2781,9 @@ "mvn%?s\\t%0, %1%S3" [(set_attr "conds" "set") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) ;; We don't really have extzv, but defining this using shifts helps @@ -2841,14 +2880,14 @@ (define_expand "negsf2" [(set (match_operand:SF 0 "s_register_operand" "") (neg:SF (match_operand:SF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "" ) (define_expand "negdf2" [(set (match_operand:DF 0 "s_register_operand" "") (neg:DF (match_operand:DF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "") ;; abssi2 doesn't really clobber the condition codes if a different register @@ -2895,25 +2934,25 @@ (define_expand "abssf2" [(set (match_operand:SF 0 "s_register_operand" "") (abs:SF (match_operand:SF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" "") (define_expand "absdf2" [(set (match_operand:DF 0 "s_register_operand" "") (abs:DF (match_operand:DF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" "") (define_expand "sqrtsf2" [(set (match_operand:SF 0 "s_register_operand" "") (sqrt:SF (match_operand:SF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)" "") 
(define_expand "sqrtdf2" [(set (match_operand:DF 0 "s_register_operand" "") (sqrt:DF (match_operand:DF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)" "") (define_insn_and_split "one_cmpldi2" @@ -2984,9 +3023,9 @@ (define_expand "floatsisf2" [(set (match_operand:SF 0 "s_register_operand" "") (float:SF (match_operand:SI 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS) + if (TARGET_MAVERICK) { emit_insn (gen_cirrus_floatsisf2 (operands[0], operands[1])); DONE; @@ -2996,9 +3035,9 @@ (define_expand "floatsidf2" [(set (match_operand:DF 0 "s_register_operand" "") (float:DF (match_operand:SI 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS) + if (TARGET_MAVERICK) { emit_insn (gen_cirrus_floatsidf2 (operands[0], operands[1])); DONE; @@ -3008,9 +3047,9 @@ (define_expand "fix_truncsfsi2" [(set (match_operand:SI 0 "s_register_operand" "") (fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" ""))))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS) + if (TARGET_MAVERICK) { if (!cirrus_fp_register (operands[0], SImode)) operands[0] = force_reg (SImode, operands[0]); @@ -3024,9 +3063,9 @@ (define_expand "fix_truncdfsi2" [(set (match_operand:SI 0 "s_register_operand" "") (fix:SI (fix:DF (match_operand:DF 1 "s_register_operand" ""))))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS) + if (TARGET_MAVERICK) { if (!cirrus_fp_register (operands[1], DFmode)) operands[1] = force_reg (DFmode, operands[0]); @@ -3041,7 +3080,7 @@ [(set (match_operand:SF 0 "s_register_operand" "") (float_truncate:SF (match_operand:DF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" "" ) @@ -3070,7 +3109,7 @@ 
ldr%?b\\t%Q0, %1\;mov%?\\t%R0, #0" [(set_attr "length" "8") (set_attr "predicable" "yes") - (set_attr "type" "*,load") + (set_attr "type" "*,load_byte") (set_attr "pool_range" "*,4092") (set_attr "neg_pool_range" "*,4084")] ) @@ -3099,72 +3138,42 @@ "TARGET_EITHER" " { - if (TARGET_ARM) + if ((TARGET_THUMB || arm_arch4) && GET_CODE (operands[1]) == MEM) { - if (arm_arch4 && GET_CODE (operands[1]) == MEM) - { - /* Note: We do not have to worry about TARGET_MMU_TRAPS - here because the insn below will generate an LDRH instruction - rather than an LDR instruction, so we cannot get an unaligned - word access. */ - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_ZERO_EXTEND (SImode, - operands[1]))); - DONE; - } - if (TARGET_MMU_TRAPS && GET_CODE (operands[1]) == MEM) - { - emit_insn (gen_movhi_bytes (operands[0], operands[1])); - DONE; - } - if (!s_register_operand (operands[1], HImode)) - operands[1] = copy_to_mode_reg (HImode, operands[1]); - operands[1] = gen_lowpart (SImode, operands[1]); - operands[2] = gen_reg_rtx (SImode); + /* Note: We do not have to worry about TARGET_MMU_TRAPS + here because the insn below will generate an LDRH instruction + rather than an LDR instruction, so we cannot get an unaligned + word access. 
*/ + emit_insn (gen_rtx_SET (VOIDmode, operands[0], + gen_rtx_ZERO_EXTEND (SImode, operands[1]))); + DONE; } - else /* TARGET_THUMB */ - { - if (GET_CODE (operands[1]) == MEM) - { - rtx tmp; - tmp = gen_rtx_ZERO_EXTEND (SImode, operands[1]); - tmp = gen_rtx_SET (VOIDmode, operands[0], tmp); - emit_insn (tmp); - } - else - { - rtx ops[3]; - - if (!s_register_operand (operands[1], HImode)) - operands[1] = copy_to_mode_reg (HImode, operands[1]); - operands[1] = gen_lowpart (SImode, operands[1]); - operands[2] = gen_reg_rtx (SImode); - - ops[0] = operands[2]; - ops[1] = operands[1]; - ops[2] = GEN_INT (16); - - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_ASHIFT (SImode, ops[1], ops[2]))); + if (TARGET_ARM && TARGET_MMU_TRAPS && GET_CODE (operands[1]) == MEM) + { + emit_insn (gen_movhi_bytes (operands[0], operands[1])); + DONE; + } - ops[0] = operands[0]; - ops[1] = operands[2]; - ops[2] = GEN_INT (16); + if (!s_register_operand (operands[1], HImode)) + operands[1] = copy_to_mode_reg (HImode, operands[1]); - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_LSHIFTRT (SImode, ops[1], - ops[2]))); - } - DONE; + if (arm_arch6) + { + emit_insn (gen_rtx_SET (VOIDmode, operands[0], + gen_rtx_ZERO_EXTEND (SImode, operands[1]))); + DONE; } + + operands[1] = gen_lowpart (SImode, operands[1]); + operands[2] = gen_reg_rtx (SImode); }" ) (define_insn "*thumb_zero_extendhisi2" - [(set (match_operand:SI 0 "register_operand" "=l") - (zero_extend:SI (match_operand:HI 1 "memory_operand" "m")))] - "TARGET_THUMB" + [(set (match_operand:SI 0 "register_operand" "=l") + (zero_extend:SI (match_operand:HI 1 "memory_operand" "m")))] + "TARGET_THUMB && !arm_arch6" "* rtx mem = XEXP (operands[1], 0); @@ -3199,21 +3208,91 @@ return \"ldrh\\t%0, %1\"; " [(set_attr "length" "4") - (set_attr "type" "load") + (set_attr "type" "load_byte") (set_attr "pool_range" "60")] ) +(define_insn "*thumb_zero_extendhisi2_v6" + [(set (match_operand:SI 0 "register_operand" "=l,l") + (zero_extend:SI 
(match_operand:HI 1 "nonimmediate_operand" "l,m")))] + "TARGET_THUMB && arm_arch6" + "* + rtx mem; + + if (which_alternative == 0) + return \"uxth\\t%0, %1\"; + + mem = XEXP (operands[1], 0); + + if (GET_CODE (mem) == CONST) + mem = XEXP (mem, 0); + + if (GET_CODE (mem) == LABEL_REF) + return \"ldr\\t%0, %1\"; + + if (GET_CODE (mem) == PLUS) + { + rtx a = XEXP (mem, 0); + rtx b = XEXP (mem, 1); + + /* This can happen due to bugs in reload. */ + if (GET_CODE (a) == REG && REGNO (a) == SP_REGNUM) + { + rtx ops[2]; + ops[0] = operands[0]; + ops[1] = a; + + output_asm_insn (\"mov %0, %1\", ops); + + XEXP (mem, 0) = operands[0]; + } + + else if ( GET_CODE (a) == LABEL_REF + && GET_CODE (b) == CONST_INT) + return \"ldr\\t%0, %1\"; + } + + return \"ldrh\\t%0, %1\"; + " + [(set_attr "length" "2,4") + (set_attr "type" "alu_shift,load_byte") + (set_attr "pool_range" "*,60")] +) + (define_insn "*arm_zero_extendhisi2" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (zero_extend:SI (match_operand:HI 1 "memory_operand" "m")))] - "TARGET_ARM && arm_arch4" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (zero_extend:SI (match_operand:HI 1 "memory_operand" "m")))] + "TARGET_ARM && arm_arch4 && !arm_arch6" "ldr%?h\\t%0, %1" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes") (set_attr "pool_range" "256") (set_attr "neg_pool_range" "244")] ) +(define_insn "*arm_zero_extendhisi2_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (zero_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))] + "TARGET_ARM && arm_arch6" + "@ + uxth%?\\t%0, %1 + ldr%?h\\t%0, %1" + [(set_attr "type" "alu_shift,load_byte") + (set_attr "predicable" "yes") + (set_attr "pool_range" "*,256") + (set_attr "neg_pool_range" "*,244")] +) + +(define_insn "*arm_zero_extendhisi2addsi" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (plus:SI (zero_extend:SI (match_operand:HI 1 "s_register_operand" "r")) + (match_operand:SI 2 
"s_register_operand" "r")))] + "TARGET_ARM && arm_arch6" + "uxtah%?\\t%0, %2, %1" + [(set_attr "type" "alu_shift") + (set_attr "predicable" "yes")] +) + (define_split [(set (match_operand:SI 0 "s_register_operand" "") (zero_extend:SI (match_operand:HI 1 "alignable_memory_operand" ""))) @@ -3249,7 +3328,7 @@ (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "")))] "TARGET_EITHER" " - if (GET_CODE (operands[1]) != MEM) + if (!arm_arch6 && GET_CODE (operands[1]) != MEM) { if (TARGET_ARM) { @@ -3285,26 +3364,61 @@ ) (define_insn "*thumb_zero_extendqisi2" - [(set (match_operand:SI 0 "register_operand" "=l") - (zero_extend:SI (match_operand:QI 1 "memory_operand" "m")))] - "TARGET_THUMB" + [(set (match_operand:SI 0 "register_operand" "=l") + (zero_extend:SI (match_operand:QI 1 "memory_operand" "m")))] + "TARGET_THUMB && !arm_arch6" "ldrb\\t%0, %1" [(set_attr "length" "2") - (set_attr "type" "load") + (set_attr "type" "load_byte") (set_attr "pool_range" "32")] ) +(define_insn "*thumb_zero_extendqisi2_v6" + [(set (match_operand:SI 0 "register_operand" "=l,l") + (zero_extend:SI (match_operand:QI 1 "nonimmediate_operand" "l,m")))] + "TARGET_THUMB && arm_arch6" + "@ + uxtb\\t%0, %1 + ldrb\\t%0, %1" + [(set_attr "length" "2,2") + (set_attr "type" "alu_shift,load_byte") + (set_attr "pool_range" "*,32")] +) + (define_insn "*arm_zero_extendqisi2" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (zero_extend:SI (match_operand:QI 1 "memory_operand" "m")))] - "TARGET_ARM" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (zero_extend:SI (match_operand:QI 1 "memory_operand" "m")))] + "TARGET_ARM && !arm_arch6" "ldr%?b\\t%0, %1\\t%@ zero_extendqisi2" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes") (set_attr "pool_range" "4096") (set_attr "neg_pool_range" "4084")] ) +(define_insn "*arm_zero_extendqisi2_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (zero_extend:SI (match_operand:QI 1 
"nonimmediate_operand" "r,m")))] + "TARGET_ARM && arm_arch6" + "@ + uxtb%?\\t%0, %1 + ldr%?b\\t%0, %1\\t%@ zero_extendqisi2" + [(set_attr "type" "alu_shift,load_byte") + (set_attr "predicable" "yes") + (set_attr "pool_range" "*,4096") + (set_attr "neg_pool_range" "*,4084")] +) + +(define_insn "*arm_zero_extendqisi2addsi" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (plus:SI (zero_extend:SI (match_operand:QI 1 "s_register_operand" "r")) + (match_operand:SI 2 "s_register_operand" "r")))] + "TARGET_ARM && arm_arch6" + "uxtab%?\\t%0, %2, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "alu_shift")] +) + (define_split [(set (match_operand:SI 0 "s_register_operand" "") (zero_extend:SI (subreg:QI (match_operand:SI 1 "" "") 0))) @@ -3334,15 +3448,23 @@ "TARGET_EITHER" " { - if (TARGET_ARM && arm_arch4 && GET_CODE (operands[1]) == MEM) + if (GET_CODE (operands[1]) == MEM) { - /* Note: We do not have to worry about TARGET_MMU_TRAPS - here because the insn below will generate an LDRH instruction - rather than an LDR instruction, so we cannot get an unaligned - word access. */ - emit_insn (gen_rtx_SET (VOIDmode, operands[0], - gen_rtx_SIGN_EXTEND (SImode, operands[1]))); - DONE; + if (TARGET_THUMB) + { + emit_insn (gen_thumb_extendhisi2 (operands[0], operands[1])); + DONE; + } + else if (arm_arch4) + { + /* Note: We do not have to worry about TARGET_MMU_TRAPS + here because the insn below will generate an LDRH instruction + rather than an LDR instruction, so we cannot get an unaligned + word access. 
*/ + emit_insn (gen_rtx_SET (VOIDmode, operands[0], + gen_rtx_SIGN_EXTEND (SImode, operands[1]))); + DONE; + } } if (TARGET_ARM && TARGET_MMU_TRAPS && GET_CODE (operands[1]) == MEM) @@ -3350,39 +3472,31 @@ emit_insn (gen_extendhisi2_mem (operands[0], operands[1])); DONE; } + if (!s_register_operand (operands[1], HImode)) operands[1] = copy_to_mode_reg (HImode, operands[1]); - operands[1] = gen_lowpart (SImode, operands[1]); - operands[2] = gen_reg_rtx (SImode); - if (TARGET_THUMB) + if (arm_arch6) { - rtx ops[3]; - - ops[0] = operands[2]; - ops[1] = operands[1]; - ops[2] = GEN_INT (16); - - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_ASHIFT (SImode, ops[1], ops[2]))); - - ops[0] = operands[0]; - ops[1] = operands[2]; - ops[2] = GEN_INT (16); - - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_ASHIFTRT (SImode, ops[1], ops[2]))); - + if (TARGET_THUMB) + emit_insn (gen_thumb_extendhisi2 (operands[0], operands[1])); + else + emit_insn (gen_rtx_SET (VOIDmode, operands[0], + gen_rtx_SIGN_EXTEND (SImode, operands[1]))); + DONE; } + + operands[1] = gen_lowpart (SImode, operands[1]); + operands[2] = gen_reg_rtx (SImode); }" ) -(define_insn "*thumb_extendhisi2_insn" - [(set (match_operand:SI 0 "register_operand" "=l") - (sign_extend:SI (match_operand:HI 1 "memory_operand" "m"))) - (clobber (match_scratch:SI 2 "=&l"))] - "TARGET_THUMB" +(define_insn "thumb_extendhisi2" + [(set (match_operand:SI 0 "register_operand" "=l") + (sign_extend:SI (match_operand:HI 1 "memory_operand" "m"))) + (clobber (match_scratch:SI 2 "=&l"))] + "TARGET_THUMB && !arm_arch6" "* { rtx ops[4]; @@ -3432,10 +3546,83 @@ return \"\"; }" [(set_attr "length" "4") - (set_attr "type" "load") + (set_attr "type" "load_byte") (set_attr "pool_range" "1020")] ) +;; We used to have an early-clobber on the scratch register here. 
+;; However, there's a bug somewhere in reload which means that this +;; can be partially ignored during spill allocation if the memory +;; address also needs reloading; this causes an abort later on when +;; we try to verify the operands. Fortunately, we don't really need +;; the early-clobber: we can always use operand 0 if operand 2 +;; overlaps the address. +(define_insn "*thumb_extendhisi2_insn_v6" + [(set (match_operand:SI 0 "register_operand" "=l,l") + (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "l,m"))) + (clobber (match_scratch:SI 2 "=X,l"))] + "TARGET_THUMB && arm_arch6" + "* + { + rtx ops[4]; + rtx mem; + + if (which_alternative == 0) + return \"sxth\\t%0, %1\"; + + mem = XEXP (operands[1], 0); + + /* This code used to try to use 'V', and fix the address only if it was + offsettable, but this fails for e.g. REG+48 because 48 is outside the + range of QImode offsets, and offsettable_address_p does a QImode + address check. */ + + if (GET_CODE (mem) == CONST) + mem = XEXP (mem, 0); + + if (GET_CODE (mem) == LABEL_REF) + return \"ldr\\t%0, %1\"; + + if (GET_CODE (mem) == PLUS) + { + rtx a = XEXP (mem, 0); + rtx b = XEXP (mem, 1); + + if (GET_CODE (a) == LABEL_REF + && GET_CODE (b) == CONST_INT) + return \"ldr\\t%0, %1\"; + + if (GET_CODE (b) == REG) + return \"ldrsh\\t%0, %1\"; + + ops[1] = a; + ops[2] = b; + } + else + { + ops[1] = mem; + ops[2] = const0_rtx; + } + + if (GET_CODE (ops[1]) != REG) + { + debug_rtx (ops[1]); + abort (); + } + + ops[0] = operands[0]; + if (reg_mentioned_p (operands[2], ops[1])) + ops[3] = ops[0]; + else + ops[3] = operands[2]; + output_asm_insn (\"mov\\t%3, %2\;ldrsh\\t%0, [%1, %3]\", ops); + return \"\"; + }" + [(set_attr "length" "2,4") + (set_attr "type" "alu_shift,load_byte") + (set_attr "pool_range" "*,1020")] +) + (define_expand "extendhisi2_mem" [(set (match_dup 2) (zero_extend:SI (match_operand:HI 1 "" ""))) (set (match_dup 3) @@ -3473,17 +3660,38 @@ }" ) -(define_insn "*arm_extendhisi_insn" - [(set 
(match_operand:SI 0 "s_register_operand" "=r") - (sign_extend:SI (match_operand:HI 1 "memory_operand" "m")))] - "TARGET_ARM && arm_arch4" +(define_insn "*arm_extendhisi2" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (sign_extend:SI (match_operand:HI 1 "memory_operand" "m")))] + "TARGET_ARM && arm_arch4 && !arm_arch6" "ldr%?sh\\t%0, %1" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes") (set_attr "pool_range" "256") (set_attr "neg_pool_range" "244")] ) +(define_insn "*arm_extendhisi2_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (sign_extend:SI (match_operand:HI 1 "nonimmediate_operand" "r,m")))] + "TARGET_ARM && arm_arch6" + "@ + sxth%?\\t%0, %1 + ldr%?sh\\t%0, %1" + [(set_attr "type" "alu_shift,load_byte") + (set_attr "predicable" "yes") + (set_attr "pool_range" "*,256") + (set_attr "neg_pool_range" "*,244")] +) + +(define_insn "*arm_extendhisi2addsi" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (plus:SI (sign_extend:SI (match_operand:HI 1 "s_register_operand" "r")) + (match_operand:SI 2 "s_register_operand" "r")))] + "TARGET_ARM && arm_arch6" + "sxtah%?\\t%0, %2, %1" +) + (define_split [(set (match_operand:SI 0 "s_register_operand" "") (sign_extend:SI (match_operand:HI 1 "alignable_memory_operand" ""))) @@ -3550,7 +3758,7 @@ return \"#\"; return \"ldr%?sb\\t%0, %1\"; " - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes") (set_attr "length" "8") (set_attr "pool_range" "256") @@ -3601,60 +3809,78 @@ "TARGET_EITHER" " { - if (TARGET_ARM && arm_arch4 && GET_CODE (operands[1]) == MEM) + if ((TARGET_THUMB || arm_arch4) && GET_CODE (operands[1]) == MEM) { - emit_insn (gen_rtx_SET (VOIDmode, - operands[0], + emit_insn (gen_rtx_SET (VOIDmode, operands[0], gen_rtx_SIGN_EXTEND (SImode, operands[1]))); DONE; } + if (!s_register_operand (operands[1], QImode)) operands[1] = copy_to_mode_reg (QImode, operands[1]); - operands[1] = gen_lowpart (SImode, 
operands[1]); - operands[2] = gen_reg_rtx (SImode); - - if (TARGET_THUMB) - { - rtx ops[3]; - - ops[0] = operands[2]; - ops[1] = operands[1]; - ops[2] = GEN_INT (24); - - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_ASHIFT (SImode, ops[1], ops[2]))); - ops[0] = operands[0]; - ops[1] = operands[2]; - ops[2] = GEN_INT (24); - - emit_insn (gen_rtx_SET (VOIDmode, ops[0], - gen_rtx_ASHIFTRT (SImode, ops[1], ops[2]))); - - DONE; + if (arm_arch6) + { + emit_insn (gen_rtx_SET (VOIDmode, operands[0], + gen_rtx_SIGN_EXTEND (SImode, operands[1]))); + DONE; } + + operands[1] = gen_lowpart (SImode, operands[1]); + operands[2] = gen_reg_rtx (SImode); }" ) ; Rather than restricting all byte accesses to memory addresses that ldrsb ; can handle, we fix up the ones that ldrsb can't grok with a split. -(define_insn "*arm_extendqisi_insn" - [(set (match_operand:SI 0 "s_register_operand" "=r") - (sign_extend:SI (match_operand:QI 1 "memory_operand" "m")))] - "TARGET_ARM && arm_arch4" +(define_insn "*arm_extendqisi" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (sign_extend:SI (match_operand:QI 1 "memory_operand" "m")))] + "TARGET_ARM && arm_arch4 && !arm_arch6" "* /* If the address is invalid, this will split the instruction into two. */ if (bad_signed_byte_operand (operands[1], VOIDmode)) return \"#\"; return \"ldr%?sb\\t%0, %1\"; " - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes") (set_attr "length" "8") (set_attr "pool_range" "256") (set_attr "neg_pool_range" "244")] ) +(define_insn "*arm_extendqisi_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r,r") + (sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "r,m")))] + "TARGET_ARM && arm_arch6" + "* + if (which_alternative == 0) + return \"sxtb%?\\t%0, %1\"; + + /* If the address is invalid, this will split the instruction into two. 
*/ + if (bad_signed_byte_operand (operands[1], VOIDmode)) + return \"#\"; + + return \"ldr%?sb\\t%0, %1\"; + " + [(set_attr "type" "alu_shift,load_byte") + (set_attr "predicable" "yes") + (set_attr "length" "4,8") + (set_attr "pool_range" "*,256") + (set_attr "neg_pool_range" "*,244")] +) + +(define_insn "*arm_extendqisi2addsi" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (plus:SI (sign_extend:SI (match_operand:QI 1 "s_register_operand" "r")) + (match_operand:SI 2 "s_register_operand" "r")))] + "TARGET_ARM && arm_arch6" + "sxtab%?\\t%0, %2, %1" + [(set_attr "type" "alu_shift") + (set_attr "predicable" "yes")] +) + (define_split [(set (match_operand:SI 0 "s_register_operand" "") (sign_extend:SI (match_operand:QI 1 "bad_signed_byte_operand" "")))] @@ -3688,10 +3914,10 @@ }" ) -(define_insn "*thumb_extendqisi2_insn" - [(set (match_operand:SI 0 "register_operand" "=l,l") - (sign_extend:SI (match_operand:QI 1 "memory_operand" "V,m")))] - "TARGET_THUMB" +(define_insn "*thumb_extendqisi2" + [(set (match_operand:SI 0 "register_operand" "=l,l") + (sign_extend:SI (match_operand:QI 1 "memory_operand" "V,m")))] + "TARGET_THUMB && !arm_arch6" "* { rtx ops[3]; @@ -3763,14 +3989,95 @@ return \"\"; }" [(set_attr "length" "2,6") - (set_attr "type" "load,load") + (set_attr "type" "load_byte,load_byte") (set_attr "pool_range" "32,32")] ) +(define_insn "*thumb_extendqisi2_v6" + [(set (match_operand:SI 0 "register_operand" "=l,l,l") + (sign_extend:SI (match_operand:QI 1 "nonimmediate_operand" "l,V,m")))] + "TARGET_THUMB && arm_arch6" + "* + { + rtx ops[3]; + rtx mem; + + if (which_alternative == 0) + return \"sxtb\\t%0, %1\"; + + mem = XEXP (operands[1], 0); + + if (GET_CODE (mem) == CONST) + mem = XEXP (mem, 0); + + if (GET_CODE (mem) == LABEL_REF) + return \"ldr\\t%0, %1\"; + + if (GET_CODE (mem) == PLUS + && GET_CODE (XEXP (mem, 0)) == LABEL_REF) + return \"ldr\\t%0, %1\"; + + if (which_alternative == 0) + return \"ldrsb\\t%0, %1\"; + + ops[0] = operands[0]; + + if 
(GET_CODE (mem) == PLUS) + { + rtx a = XEXP (mem, 0); + rtx b = XEXP (mem, 1); + + ops[1] = a; + ops[2] = b; + + if (GET_CODE (a) == REG) + { + if (GET_CODE (b) == REG) + output_asm_insn (\"ldrsb\\t%0, [%1, %2]\", ops); + else if (REGNO (a) == REGNO (ops[0])) + { + output_asm_insn (\"ldrb\\t%0, [%1, %2]\", ops); + output_asm_insn (\"sxtb\\t%0, %0\", ops); + } + else + output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); + } + else if (GET_CODE (b) != REG) + abort (); + else + { + if (REGNO (b) == REGNO (ops[0])) + { + output_asm_insn (\"ldrb\\t%0, [%2, %1]\", ops); + output_asm_insn (\"sxtb\\t%0, %0\", ops); + } + else + output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); + } + } + else if (GET_CODE (mem) == REG && REGNO (ops[0]) == REGNO (mem)) + { + output_asm_insn (\"ldrb\\t%0, [%0, #0]\", ops); + output_asm_insn (\"sxtb\\t%0, %0\", ops); + } + else + { + ops[1] = mem; + ops[2] = const0_rtx; + + output_asm_insn (\"mov\\t%0, %2\;ldrsb\\t%0, [%1, %0]\", ops); + } + return \"\"; + }" + [(set_attr "length" "2,2,4") + (set_attr "type" "alu_shift,load_byte,load_byte") + (set_attr "pool_range" "*,32,32")] +) + (define_expand "extendsfdf2" [(set (match_operand:DF 0 "s_register_operand" "") (float_extend:DF (match_operand:SF 1 "s_register_operand" "")))] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT" "" ) @@ -3854,12 +4161,14 @@ (define_insn "*arm_movdi" [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r, o<>") (match_operand:DI 1 "di_operand" "rIK,mi,r"))] - "TARGET_ARM && !TARGET_CIRRUS && ! 
TARGET_IWMMXT" + "TARGET_ARM + && !(TARGET_HARD_FLOAT && (TARGET_MAVERICK || TARGET_VFP)) + && !TARGET_IWMMXT" "* return (output_move_double (operands)); " [(set_attr "length" "8") - (set_attr "type" "*,load,store2") + (set_attr "type" "*,load2,store2") (set_attr "pool_range" "*,1020,*") (set_attr "neg_pool_range" "*,1008,*")] ) @@ -3872,7 +4181,7 @@ [(set (match_operand:DI 0 "nonimmediate_operand" "=l,l,l,l,>,l, m,*r") (match_operand:DI 1 "general_operand" "l, I,J,>,l,mi,l,*r"))] "TARGET_THUMB - && !TARGET_CIRRUS + && !(TARGET_HARD_FLOAT && TARGET_MAVERICK) && ( register_operand (operands[0], DImode) || register_operand (operands[1], DImode))" "* @@ -3907,7 +4216,7 @@ } }" [(set_attr "length" "4,4,6,2,2,6,4,4") - (set_attr "type" "*,*,*,load,store2,load,store2,*") + (set_attr "type" "*,*,*,load2,store2,load2,store2,*") (set_attr "pool_range" "*,*,*,*,*,1020,*,*")] ) @@ -3921,7 +4230,8 @@ /* Everything except mem = const or mem = mem can be done easily. */ if (GET_CODE (operands[0]) == MEM) operands[1] = force_reg (SImode, operands[1]); - if (GET_CODE (operands[1]) == CONST_INT + if (arm_general_register_operand (operands[0], SImode) + && GET_CODE (operands[1]) == CONST_INT && !(const_ok_for_arm (INTVAL (operands[1])) || const_ok_for_arm (~INTVAL (operands[1])))) { @@ -3954,6 +4264,7 @@ [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r, m") (match_operand:SI 1 "general_operand" "rI,K,mi,r"))] "TARGET_ARM && ! 
TARGET_IWMMXT + && !(TARGET_HARD_FLOAT && TARGET_VFP) && ( register_operand (operands[0], SImode) || register_operand (operands[1], SImode))" "@ @@ -3961,14 +4272,14 @@ mvn%?\\t%0, #%B1 ldr%?\\t%0, %1 str%?\\t%1, %0" - [(set_attr "type" "*,*,load,store1") + [(set_attr "type" "*,*,load1,store1") (set_attr "predicable" "yes") (set_attr "pool_range" "*,*,4096,*") (set_attr "neg_pool_range" "*,*,4084,*")] ) (define_split - [(set (match_operand:SI 0 "s_register_operand" "") + [(set (match_operand:SI 0 "arm_general_register_operand" "") (match_operand:SI 1 "const_int_operand" ""))] "TARGET_ARM && (!(const_ok_for_arm (INTVAL (operands[1])) @@ -3998,7 +4309,7 @@ str\\t%1, %0 mov\\t%0, %1" [(set_attr "length" "2,2,4,4,2,2,2,2,2") - (set_attr "type" "*,*,*,*,load,store1,load,store1,*") + (set_attr "type" "*,*,*,*,load1,store1,load1,store1,*") (set_attr "pool_range" "*,*,*,*,*,*,1020,*,*")] ) @@ -4050,7 +4361,7 @@ (unspec:SI [(match_operand:SI 1 "" "mX")] UNSPEC_PIC_SYM))] "TARGET_ARM && flag_pic" "ldr%?\\t%0, %1" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set (attr "pool_range") (const_int 4096)) (set (attr "neg_pool_range") (const_int 4084))] ) @@ -4060,7 +4371,7 @@ (unspec:SI [(match_operand:SI 1 "" "mX")] UNSPEC_PIC_SYM))] "TARGET_THUMB && flag_pic" "ldr\\t%0, %1" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set (attr "pool_range") (const_int 1024))] ) @@ -4086,7 +4397,7 @@ output_asm_insn (\"ldr%?\\t%0, %a1\", operands); return \"\"; " - [(set_attr "type" "load") + [(set_attr "type" "load1") (set (attr "pool_range") (if_then_else (eq_attr "is_thumb" "yes") (const_int 1024) @@ -4514,7 +4825,7 @@ return \"ldrh %0, %1\"; }" [(set_attr "length" "2,4,2,2,2,2") - (set_attr "type" "*,load,store1,*,*,*")] + (set_attr "type" "*,load1,store1,*,*,*")] ) @@ -4532,7 +4843,7 @@ output_asm_insn (\"ldr%?\\t%0, %1\\t%@ load-rotate\", ops); return \"\"; }" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "predicable" "yes")] ) @@ -4601,7 
+4912,7 @@ mvn%?\\t%0, #%B1\\t%@ movhi str%?h\\t%1, %0\\t%@ movhi ldr%?h\\t%0, %1\\t%@ movhi" - [(set_attr "type" "*,*,store1,load") + [(set_attr "type" "*,*,store1,load1") (set_attr "predicable" "yes") (set_attr "pool_range" "*,*,*,256") (set_attr "neg_pool_range" "*,*,*,244")] @@ -4621,7 +4932,7 @@ mov%?\\t%0, %1\\t%@ movhi mvn%?\\t%0, #%B1\\t%@ movhi ldr%?\\t%0, %1\\t%@ movhi" - [(set_attr "type" "*,*,load") + [(set_attr "type" "*,*,load1") (set_attr "predicable" "yes") (set_attr "pool_range" "4096") (set_attr "neg_pool_range" "4084")] @@ -4641,7 +4952,7 @@ mov%?\\t%0, %1\\t%@ movhi mvn%?\\t%0, #%B1\\t%@ movhi ldr%?\\t%0, %1\\t%@ movhi_bigend\;mov%?\\t%0, %0, asr #16" - [(set_attr "type" "*,*,load") + [(set_attr "type" "*,*,load1") (set_attr "predicable" "yes") (set_attr "length" "4,4,8") (set_attr "pool_range" "*,*,4092") @@ -4656,7 +4967,7 @@ && BYTES_BIG_ENDIAN && !TARGET_MMU_TRAPS" "ldr%?\\t%0, %1\\t%@ movhi_bigend" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "predicable" "yes") (set_attr "pool_range" "4096") (set_attr "neg_pool_range" "4084")] @@ -4796,7 +5107,7 @@ mvn%?\\t%0, #%B1 ldr%?b\\t%0, %1 str%?b\\t%1, %0" - [(set_attr "type" "*,*,load,store1") + [(set_attr "type" "*,*,load1,store1") (set_attr "predicable" "yes")] ) @@ -4814,7 +5125,7 @@ mov\\t%0, %1 mov\\t%0, %1" [(set_attr "length" "2") - (set_attr "type" "*,load,store1,*,*,*") + (set_attr "type" "*,load1,store1,*,*,*") (set_attr "pool_range" "*,32,*,*,*,*")] ) @@ -4843,7 +5154,7 @@ [(set (match_operand:SF 0 "nonimmediate_operand" "") (match_operand:SF 1 "immediate_operand" ""))] "TARGET_ARM - && !TARGET_HARD_FLOAT + && !(TARGET_HARD_FLOAT && TARGET_FPA) && reload_completed && GET_CODE (operands[1]) == CONST_DOUBLE" [(set (match_dup 2) (match_dup 3))] @@ -4859,7 +5170,6 @@ [(set (match_operand:SF 0 "nonimmediate_operand" "=r,r,m") (match_operand:SF 1 "general_operand" "r,mE,r"))] "TARGET_ARM - && !TARGET_CIRRUS && TARGET_SOFT_FLOAT && (GET_CODE (operands[0]) != MEM || 
register_operand (operands[1], SFmode))" @@ -4869,7 +5179,7 @@ str%?\\t%1, %0\\t%@ float" [(set_attr "length" "4,4,4") (set_attr "predicable" "yes") - (set_attr "type" "*,load,store1") + (set_attr "type" "*,load1,store1") (set_attr "pool_range" "*,4096,*") (set_attr "neg_pool_range" "*,4084,*")] ) @@ -4890,7 +5200,7 @@ mov\\t%0, %1 mov\\t%0, %1" [(set_attr "length" "2") - (set_attr "type" "*,load,store1,load,store1,*,*") + (set_attr "type" "*,load1,store1,load1,store1,*,*") (set_attr "pool_range" "*,*,*,1020,*,*,*")] ) @@ -4962,11 +5272,10 @@ [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=r,r,m") (match_operand:DF 1 "soft_df_operand" "r,mF,r"))] "TARGET_ARM && TARGET_SOFT_FLOAT - && !TARGET_CIRRUS " "* return output_move_double (operands);" [(set_attr "length" "8,8,8") - (set_attr "type" "*,load,store2") + (set_attr "type" "*,load2,store2") (set_attr "pool_range" "1020") (set_attr "neg_pool_range" "1008")] ) @@ -5007,7 +5316,7 @@ } " [(set_attr "length" "4,2,2,6,4,4") - (set_attr "type" "*,load,store2,load,store2,*") + (set_attr "type" "*,load2,store2,load2,store2,*") (set_attr "pool_range" "*,*,*,1020,*,*")] ) @@ -5080,7 +5389,7 @@ (mem:SI (plus:SI (match_dup 2) (const_int 12))))])] "TARGET_ARM && XVECLEN (operands[0], 0) == 5" "ldm%?ia\\t%1!, {%3, %4, %5, %6}" - [(set_attr "type" "load") + [(set_attr "type" "load4") (set_attr "predicable" "yes")] ) @@ -5097,7 +5406,7 @@ (mem:SI (plus:SI (match_dup 2) (const_int 8))))])] "TARGET_ARM && XVECLEN (operands[0], 0) == 4" "ldm%?ia\\t%1!, {%3, %4, %5}" - [(set_attr "type" "load") + [(set_attr "type" "load3") (set_attr "predicable" "yes")] ) @@ -5112,7 +5421,7 @@ (mem:SI (plus:SI (match_dup 2) (const_int 4))))])] "TARGET_ARM && XVECLEN (operands[0], 0) == 3" "ldm%?ia\\t%1!, {%3, %4}" - [(set_attr "type" "load") + [(set_attr "type" "load2") (set_attr "predicable" "yes")] ) @@ -5130,7 +5439,7 @@ (mem:SI (plus:SI (match_dup 1) (const_int 12))))])] "TARGET_ARM && XVECLEN (operands[0], 0) == 4" "ldm%?ia\\t%1, {%2, 
%3, %4, %5}" - [(set_attr "type" "load") + [(set_attr "type" "load4") (set_attr "predicable" "yes")] ) @@ -5144,7 +5453,7 @@ (mem:SI (plus:SI (match_dup 1) (const_int 8))))])] "TARGET_ARM && XVECLEN (operands[0], 0) == 3" "ldm%?ia\\t%1, {%2, %3, %4}" - [(set_attr "type" "load") + [(set_attr "type" "load3") (set_attr "predicable" "yes")] ) @@ -5156,7 +5465,7 @@ (mem:SI (plus:SI (match_dup 1) (const_int 4))))])] "TARGET_ARM && XVECLEN (operands[0], 0) == 2" "ldm%?ia\\t%1, {%2, %3}" - [(set_attr "type" "load") + [(set_attr "type" "load2") (set_attr "predicable" "yes")] ) @@ -6406,12 +6715,9 @@ (define_expand "cmpsf" [(match_operand:SF 0 "s_register_operand" "") - (match_operand:SF 1 "fpa_rhs_operand" "")] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (match_operand:SF 1 "arm_float_compare_operand" "")] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS && !cirrus_fp_register (operands[1], SFmode)) - operands[1] = force_reg (SFmode, operands[1]); - arm_compare_op0 = operands[0]; arm_compare_op1 = operands[1]; DONE; @@ -6420,12 +6726,9 @@ (define_expand "cmpdf" [(match_operand:DF 0 "s_register_operand" "") - (match_operand:DF 1 "fpa_rhs_operand" "")] - "TARGET_ARM && TARGET_ANY_HARD_FLOAT" + (match_operand:DF 1 "arm_float_compare_operand" "")] + "TARGET_ARM && TARGET_HARD_FLOAT" " - if (TARGET_CIRRUS && !cirrus_fp_register (operands[1], DFmode)) - operands[1] = force_reg (DFmode, operands[1]); - arm_compare_op0 = operands[0]; arm_compare_op1 = operands[1]; DONE; @@ -6453,7 +6756,9 @@ "cmp%?\\t%0, %1%S3" [(set_attr "conds" "set") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*cmpsi_shiftsi_swp" @@ -6466,7 +6771,9 @@ "cmp%?\\t%0, %1%S3" [(set_attr "conds" "set") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) 
(define_insn "*cmpsi_neg_shiftsi" @@ -6479,7 +6786,9 @@ "cmn%?\\t%0, %1%S3" [(set_attr "conds" "set") (set_attr "shift" "1") - ] + (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) ;; Cirrus SF compare instruction @@ -6487,7 +6796,7 @@ [(set (reg:CCFP CC_REGNUM) (compare:CCFP (match_operand:SF 0 "cirrus_fp_register" "v") (match_operand:SF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcmps%?\\tr15, %V0, %V1" [(set_attr "type" "mav_farith") (set_attr "cirrus" "compare")] @@ -6498,7 +6807,7 @@ [(set (reg:CCFP CC_REGNUM) (compare:CCFP (match_operand:DF 0 "cirrus_fp_register" "v") (match_operand:DF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcmpd%?\\tr15, %V0, %V1" [(set_attr "type" "mav_farith") (set_attr "cirrus" "compare")] @@ -6508,7 +6817,7 @@ (define_expand "cmpdi" [(match_operand:DI 0 "cirrus_fp_register" "") (match_operand:DI 1 "cirrus_fp_register" "")] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "{ arm_compare_op0 = operands[0]; arm_compare_op1 = operands[1]; @@ -6519,7 +6828,7 @@ [(set (reg:CC CC_REGNUM) (compare:CC (match_operand:DI 0 "cirrus_fp_register" "v") (match_operand:DI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcmp64%?\\tr15, %V0, %V1" [(set_attr "type" "mav_farith") (set_attr "cirrus" "compare")] @@ -6637,7 +6946,7 @@ (if_then_else (unordered (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNORDERED, arm_compare_op0, arm_compare_op1);" ) @@ -6647,7 +6956,7 @@ (if_then_else (ordered (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) 
(pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (ORDERED, arm_compare_op0, arm_compare_op1);" ) @@ -6657,7 +6966,7 @@ (if_then_else (ungt (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNGT, arm_compare_op0, arm_compare_op1);" ) @@ -6666,7 +6975,7 @@ (if_then_else (unlt (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNLT, arm_compare_op0, arm_compare_op1);" ) @@ -6675,7 +6984,7 @@ (if_then_else (unge (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNGE, arm_compare_op0, arm_compare_op1);" ) @@ -6684,7 +6993,7 @@ (if_then_else (unle (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNLE, arm_compare_op0, arm_compare_op1);" ) @@ -6695,7 +7004,7 @@ (if_then_else (uneq (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNEQ, arm_compare_op0, arm_compare_op1);" ) @@ -6704,7 +7013,7 @@ (if_then_else (ltgt (match_dup 1) (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (LTGT, arm_compare_op0, arm_compare_op1);" ) @@ -6718,7 +7027,7 @@ (if_then_else (uneq (match_operand 1 "cc_register" "") (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] 
- "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "* if (arm_ccfsm_state != 0) abort (); @@ -6735,7 +7044,7 @@ (if_then_else (ltgt (match_operand 1 "cc_register" "") (const_int 0)) (label_ref (match_operand 0 "" "")) (pc)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "* if (arm_ccfsm_state != 0) abort (); @@ -6761,7 +7070,8 @@ } return \"b%d1\\t%l0\"; " - [(set_attr "conds" "use")] + [(set_attr "conds" "use") + (set_attr "type" "branch")] ) ; Special pattern to match reversed UNEQ. @@ -6770,7 +7080,7 @@ (if_then_else (uneq (match_operand 1 "cc_register" "") (const_int 0)) (pc) (label_ref (match_operand 0 "" ""))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "* if (arm_ccfsm_state != 0) abort (); @@ -6787,7 +7097,7 @@ (if_then_else (ltgt (match_operand 1 "cc_register" "") (const_int 0)) (pc) (label_ref (match_operand 0 "" ""))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "* if (arm_ccfsm_state != 0) abort (); @@ -6813,7 +7123,8 @@ } return \"b%D1\\t%l0\"; " - [(set_attr "conds" "use")] + [(set_attr "conds" "use") + (set_attr "type" "branch")] ) @@ -6893,7 +7204,7 @@ (define_expand "sunordered" [(set (match_operand:SI 0 "s_register_operand" "") (unordered:SI (match_dup 1) (const_int 0)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNORDERED, arm_compare_op0, arm_compare_op1);" ) @@ -6901,7 +7212,7 @@ (define_expand "sordered" [(set (match_operand:SI 0 "s_register_operand" "") (ordered:SI (match_dup 1) (const_int 0)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (ORDERED, arm_compare_op0, arm_compare_op1);" ) @@ -6909,7 +7220,7 @@ (define_expand "sungt" [(set (match_operand:SI 0 "s_register_operand" "") (ungt:SI (match_dup 1) (const_int 0)))] - 
"TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNGT, arm_compare_op0, arm_compare_op1);" ) @@ -6917,7 +7228,7 @@ (define_expand "sunge" [(set (match_operand:SI 0 "s_register_operand" "") (unge:SI (match_dup 1) (const_int 0)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNGE, arm_compare_op0, arm_compare_op1);" ) @@ -6925,7 +7236,7 @@ (define_expand "sunlt" [(set (match_operand:SI 0 "s_register_operand" "") (unlt:SI (match_dup 1) (const_int 0)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNLT, arm_compare_op0, arm_compare_op1);" ) @@ -6933,7 +7244,7 @@ (define_expand "sunle" [(set (match_operand:SI 0 "s_register_operand" "") (unle:SI (match_dup 1) (const_int 0)))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "operands[1] = arm_gen_compare_reg (UNLE, arm_compare_op0, arm_compare_op1);" ) @@ -6944,14 +7255,14 @@ ; (define_expand "suneq" ; [(set (match_operand:SI 0 "s_register_operand" "") ; (uneq:SI (match_dup 1) (const_int 0)))] -; "TARGET_ARM && TARGET_HARD_FLOAT" +; "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" ; "abort ();" ; ) ; ; (define_expand "sltgt" ; [(set (match_operand:SI 0 "s_register_operand" "") ; (ltgt:SI (match_dup 1) (const_int 0)))] -; "TARGET_ARM && TARGET_HARD_FLOAT" +; "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" ; "abort ();" ; ) @@ -7022,9 +7333,9 @@ FAIL; /* When compiling for SOFT_FLOAT, ensure both arms are in registers. - Otherwise, ensure it is a valid FP add operand. 
*/ - if ((!TARGET_HARD_FLOAT) - || (!fpa_add_operand (operands[3], SFmode))) + Otherwise, ensure it is a valid FP add operand */ + if ((!(TARGET_HARD_FLOAT && TARGET_FPA)) + || (!arm_float_add_operand (operands[3], SFmode))) operands[3] = force_reg (SFmode, operands[3]); ccreg = arm_gen_compare_reg (code, arm_compare_op0, arm_compare_op1); @@ -7036,8 +7347,8 @@ [(set (match_operand:DF 0 "s_register_operand" "") (if_then_else:DF (match_operand 1 "arm_comparison_operator" "") (match_operand:DF 2 "s_register_operand" "") - (match_operand:DF 3 "fpa_add_operand" "")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 3 "arm_float_add_operand" "")))] + "TARGET_ARM && TARGET_HARD_FLOAT && (TARGET_FPA || TARGET_VFP)" " { enum rtx_code code = GET_CODE (operands[1]); @@ -7403,7 +7714,7 @@ } return output_return_instruction (const_true_rtx, TRUE, FALSE); }" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "length" "12") (set_attr "predicable" "yes")] ) @@ -7426,7 +7737,7 @@ }" [(set_attr "conds" "use") (set_attr "length" "12") - (set_attr "type" "load")] + (set_attr "type" "load1")] ) (define_insn "*cond_return_inverted" @@ -7446,7 +7757,7 @@ return output_return_instruction (operands[0], TRUE, TRUE); }" [(set_attr "conds" "use") - (set_attr "type" "load")] + (set_attr "type" "load1")] ) ;; Generate a sequence of instructions to determine if the processor is @@ -7590,7 +7901,7 @@ (match_operand:SI 0 "memory_operand" "m"))] "TARGET_ARM" "ldr%?\\t%|pc, %0\\t%@ indirect memory jump" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "pool_range" "4096") (set_attr "neg_pool_range" "4084") (set_attr "predicable" "yes")] @@ -7636,7 +7947,9 @@ "%i1%?\\t%0, %2, %4%S3" [(set_attr "predicable" "yes") (set_attr "shift" "4") - ] + (set (attr "type") (if_then_else (match_operand 5 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_split @@ -7672,7 +7985,9 @@ "%i1%?s\\t%0, %2, %4%S3" [(set_attr "conds" 
"set") (set_attr "shift" "4") - ] + (set (attr "type") (if_then_else (match_operand 5 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*arith_shiftsi_compare0_scratch" @@ -7688,7 +8003,9 @@ "%i1%?s\\t%0, %2, %4%S3" [(set_attr "conds" "set") (set_attr "shift" "4") - ] + (set (attr "type") (if_then_else (match_operand 5 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*sub_shiftsi" @@ -7701,7 +8018,9 @@ "sub%?\\t%0, %1, %3%S2" [(set_attr "predicable" "yes") (set_attr "shift" "3") - ] + (set (attr "type") (if_then_else (match_operand 4 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*sub_shiftsi_compare0" @@ -7718,8 +8037,10 @@ "TARGET_ARM" "sub%?s\\t%0, %1, %3%S2" [(set_attr "conds" "set") - (set_attr "shift" "3") - ] + (set_attr "shift" "3") + (set (attr "type") (if_then_else (match_operand 4 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*sub_shiftsi_compare0_scratch" @@ -7734,8 +8055,10 @@ "TARGET_ARM" "sub%?s\\t%0, %1, %3%S2" [(set_attr "conds" "set") - (set_attr "shift" "3") - ] + (set_attr "shift" "3") + (set (attr "type") (if_then_else (match_operand 4 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) @@ -8608,7 +8931,10 @@ mvn%D5\\t%0, #%B1\;mov%d5\\t%0, %2%S4" [(set_attr "conds" "use") (set_attr "shift" "2") - (set_attr "length" "4,8,8")] + (set_attr "length" "4,8,8") + (set (attr "type") (if_then_else (match_operand 3 "const_int_operand" "") + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*ifcompare_move_shift" @@ -8644,7 +8970,10 @@ mvn%d5\\t%0, #%B1\;mov%D5\\t%0, %2%S4" [(set_attr "conds" "use") (set_attr "shift" "2") - (set_attr "length" "4,8,8")] + (set_attr "length" "4,8,8") + (set (attr "type") (if_then_else (match_operand 3 "const_int_operand" "") + 
(const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*ifcompare_shift_shift" @@ -8681,7 +9010,12 @@ "mov%d5\\t%0, %1%S6\;mov%D5\\t%0, %3%S7" [(set_attr "conds" "use") (set_attr "shift" "1") - (set_attr "length" "8")] + (set_attr "length" "8") + (set (attr "type") (if_then_else + (and (match_operand 2 "const_int_operand" "") + (match_operand 4 "const_int_operand" "")) + (const_string "alu_shift") + (const_string "alu_shift_reg")))] ) (define_insn "*ifcompare_not_arith" @@ -8882,7 +9216,7 @@ }" [(set_attr "length" "12") (set_attr "predicable" "yes") - (set_attr "type" "load")] + (set_attr "type" "load1")] ) ;; the arm can support extended pre-inc instructions @@ -8939,7 +9273,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?b\\t%3, [%0, %2]!" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -8955,7 +9289,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?b\\t%3, [%0, -%2]!" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -8972,7 +9306,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?b\\t%3, [%0, %2]!\\t%@ z_extendqisi" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -8989,7 +9323,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?b\\t%3, [%0, -%2]!\\t%@ z_extendqisi" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9037,7 +9371,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?\\t%3, [%0, %2]!" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "predicable" "yes")] ) @@ -9053,7 +9387,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?\\t%3, [%0, -%2]!" 
- [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "predicable" "yes")] ) @@ -9072,7 +9406,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?\\t%3, [%0, %2]!\\t%@ loadhi" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9091,7 +9425,7 @@ && (GET_CODE (operands[2]) != REG || REGNO (operands[2]) != FRAME_POINTER_REGNUM)" "ldr%?\\t%3, [%0, -%2]!\\t%@ loadhi" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9145,7 +9479,7 @@ && REGNO (operands[1]) != FRAME_POINTER_REGNUM && REGNO (operands[3]) != FRAME_POINTER_REGNUM" "ldr%?b\\t%5, [%0, %3%S2]!" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9163,7 +9497,7 @@ && REGNO (operands[1]) != FRAME_POINTER_REGNUM && REGNO (operands[3]) != FRAME_POINTER_REGNUM" "ldr%?b\\t%5, [%0, -%3%S2]!" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9217,7 +9551,7 @@ && REGNO (operands[1]) != FRAME_POINTER_REGNUM && REGNO (operands[3]) != FRAME_POINTER_REGNUM" "ldr%?\\t%5, [%0, %3%S2]!" - [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "predicable" "yes")] ) @@ -9235,7 +9569,7 @@ && REGNO (operands[1]) != FRAME_POINTER_REGNUM && REGNO (operands[3]) != FRAME_POINTER_REGNUM" "ldr%?\\t%5, [%0, -%3%S2]!" 
- [(set_attr "type" "load") + [(set_attr "type" "load1") (set_attr "predicable" "yes")]) (define_insn "*loadhi_shiftpreinc" @@ -9255,7 +9589,7 @@ && REGNO (operands[1]) != FRAME_POINTER_REGNUM && REGNO (operands[3]) != FRAME_POINTER_REGNUM" "ldr%?\\t%5, [%0, %3%S2]!\\t%@ loadhi" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9276,7 +9610,7 @@ && REGNO (operands[1]) != FRAME_POINTER_REGNUM && REGNO (operands[3]) != FRAME_POINTER_REGNUM" "ldr%?\\t%5, [%0, -%3%S2]!\\t%@ loadhi" - [(set_attr "type" "load") + [(set_attr "type" "load_byte") (set_attr "predicable" "yes")] ) @@ -9388,7 +9722,7 @@ (set (reg:CC CC_REGNUM) (compare:CC (match_dup 1) (const_int 0)))] "TARGET_ARM - && (!TARGET_CIRRUS + && (!(TARGET_HARD_FLOAT && TARGET_MAVERICK) || (!cirrus_fp_register (operands[0], SImode) && !cirrus_fp_register (operands[1], SImode))) " @@ -9832,7 +10166,7 @@ [(set (match_operand:BLK 0 "memory_operand" "=m") (unspec:BLK [(match_operand:XF 1 "f_register_operand" "f")] UNSPEC_PUSH_MULT))])] - "TARGET_ARM" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "* { char pattern[100]; @@ -10052,3 +10386,6 @@ (include "cirrus.md") ;; Load the Intel Wireless Multimedia Extension patterns (include "iwmmxt.md") +;; Load the VFP co-processor patterns +(include "vfp.md") + diff --git a/gcc/config/arm/arm1026ejs.md b/gcc/config/arm/arm1026ejs.md new file mode 100644 index 00000000000..5dd433269ac --- /dev/null +++ b/gcc/config/arm/arm1026ejs.md @@ -0,0 +1,241 @@ +;; ARM 1026EJ-S Pipeline Description +;; Copyright (C) 2003 Free Software Foundation, Inc. +;; Written by CodeSourcery, LLC. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. 
+;;
+;; GCC is distributed in the hope that it will be useful, but
+;; WITHOUT ANY WARRANTY; without even the implied warranty of
+;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+;; General Public License for more details.
+;;
+;; You should have received a copy of the GNU General Public License
+;; along with GCC; see the file COPYING. If not, write to the Free
+;; Software Foundation, 59 Temple Place - Suite 330, Boston, MA
+;; 02111-1307, USA. */
+
+;; These descriptions are based on the information contained in the
+;; ARM1026EJ-S Technical Reference Manual, Copyright (c) 2003 ARM
+;; Limited.
+;;
+
+;; This automaton provides a pipeline description for the ARM
+;; 1026EJ-S core.
+;;
+;; The model given here assumes that the condition for all conditional
+;; instructions is "true", i.e., that all of the instructions are
+;; actually executed.
+
+(define_automaton "arm1026ejs")
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Pipelines
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; There are two pipelines:
+;;
+;; - An Arithmetic Logic Unit (ALU) pipeline.
+;;
+;; The ALU pipeline has fetch, issue, decode, execute, memory, and
+;; write stages. We only need to model the execute, memory and write
+;; stages.
+;;
+;; - A Load-Store Unit (LSU) pipeline.
+;;
+;; The LSU pipeline has decode, execute, memory, and write stages.
+;; We only model the execute, memory and write stages.
+
+(define_cpu_unit "a_e,a_m,a_w" "arm1026ejs")
+(define_cpu_unit "l_e,l_m,l_w" "arm1026ejs")
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; ALU Instructions
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; ALU instructions require three cycles to execute, and use the ALU
+;; pipeline in each of the three stages. The results are available
+;; after the execute stage has finished.
+;; +;; If the destination register is the PC, the pipelines are stalled +;; for several cycles. That case is not modeled here. + +;; ALU operations with no shifted operand +(define_insn_reservation "alu_op" 1 + (and (eq_attr "tune" "arm1026ejs") + (eq_attr "type" "alu")) + "a_e,a_m,a_w") + +;; ALU operations with a shift-by-constant operand +(define_insn_reservation "alu_shift_op" 1 + (and (eq_attr "tune" "arm1026ejs") + (eq_attr "type" "alu_shift")) + "a_e,a_m,a_w") + +;; ALU operations with a shift-by-register operand +;; These really stall in the decoder, in order to read +;; the shift value in a second cycle. Pretend we take two cycles in +;; the execute stage. +(define_insn_reservation "alu_shift_reg_op" 2 + (and (eq_attr "tune" "arm1026ejs") + (eq_attr "type" "alu_shift_reg")) + "a_e*2,a_m,a_w") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Multiplication Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; Multiplication instructions loop in the execute stage until the +;; instruction has been passed through the multiplier array enough +;; times. + +;; The result of the "smul" and "smulw" instructions is not available +;; until after the memory stage. +(define_insn_reservation "mult1" 2 + (and (eq_attr "tune" "arm1026ejs") + (eq_attr "insn" "smulxy,smulwy")) + "a_e,a_m,a_w") + +;; The "smlaxy" and "smlawx" instructions require two iterations through +;; the execute stage; the result is available immediately following +;; the execute stage. +(define_insn_reservation "mult2" 2 + (and (eq_attr "tune" "arm1026ejs") + (eq_attr "insn" "smlaxy,smlalxy,smlawx")) + "a_e*2,a_m,a_w") + +;; The "smlalxy", "mul", and "mla" instructions require two iterations +;; through the execute stage; the result is not available until after +;; the memory stage. 
+(define_insn_reservation "mult3" 3
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "insn" "smlalxy,mul,mla"))
+ "a_e*2,a_m,a_w")
+
+;; The "muls" and "mlas" instructions loop in the execute stage for
+;; four iterations in order to set the flags. The value result is
+;; available after three iterations.
+(define_insn_reservation "mult4" 3
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "insn" "muls,mlas"))
+ "a_e*4,a_m,a_w")
+
+;; Long multiply instructions that produce two registers of
+;; output (such as umull) make their results available in two cycles;
+;; the least significant word is available before the most significant
+;; word. That fact is not modeled; instead, the instructions are
+;; described as if the entire result was available at the end of the
+;; cycle in which both words are available.
+
+;; The "umull", "umlal", "smull", and "smlal" instructions all take
+;; three iterations through the execute cycle, and make their results
+;; available after the memory cycle.
+(define_insn_reservation "mult5" 4
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "insn" "umull,umlal,smull,smlal"))
+ "a_e*3,a_m,a_w")
+
+;; The "umulls", "umlals", "smulls", and "smlals" instructions loop in
+;; the execute stage for five iterations in order to set the flags.
+;; The value result is available after four iterations.
+(define_insn_reservation "mult6" 4
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "insn" "umulls,umlals,smulls,smlals"))
+ "a_e*5,a_m,a_w")
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Load/Store Instructions
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; The models for load/store instructions do not accurately describe
+;; the difference between operations with a base register writeback
+;; (such as "ldm!"). These models assume that all memory references
+;; hit in dcache.
+
+;; LSU instructions require six cycles to execute.
They use the ALU
+;; pipeline in all but the 5th cycle, and the LSU pipeline in cycles
+;; three through six.
+;; Loads and stores which use a scaled register offset or scaled
+;; register pre-indexed addressing mode take three cycles EXCEPT for
+;; those that are base + offset with LSL of 0 or 2, or base - offset
+;; with LSL of zero. The remainder take 1 cycle to execute.
+;; For 4-byte loads there is a bypass from the load stage
+
+(define_insn_reservation "load1_op" 2
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "load_byte,load1"))
+ "a_e+l_e,l_m,a_w+l_w")
+
+(define_insn_reservation "store1_op" 0
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "store1"))
+ "a_e+l_e,l_m,a_w+l_w")
+
+;; A load's result can be stored by an immediately following store
+(define_bypass 1 "load1_op" "store1_op" "arm_no_early_store_addr_dep")
+
+;; On a LDM/STM operation, the LSU pipeline iterates until all of the
+;; registers have been processed.
+;;
+;; The time it takes to load the data depends on whether or not the
+;; base address is 64-bit aligned; if it is not, an additional cycle
+;; is required. This model assumes that the address is always 64-bit
+;; aligned. Because the processor can load two registers per cycle,
+;; that assumption means that we use the same instruction reservations
+;; for loading 2k and 2k - 1 registers.
+;;
+;; The ALU pipeline is stalled until the completion of the last memory
+;; stage in the LSU pipeline. That is modeled by keeping the ALU
+;; execute stage busy until that point.
+;;
+;; As with ALU operations, if one of the destination registers is the
+;; PC, there are additional stalls; that is not modeled.
+
+(define_insn_reservation "load2_op" 2
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "load2"))
+ "a_e+l_e,l_m,a_w+l_w")
+
+(define_insn_reservation "store2_op" 0
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "store2"))
+ "a_e+l_e,l_m,a_w+l_w")
+
+(define_insn_reservation "load34_op" 3
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "load3,load4"))
+ "a_e+l_e,a_e+l_e+l_m,a_e+l_m,a_w+l_w")
+
+(define_insn_reservation "store34_op" 0
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "store3,store4"))
+ "a_e+l_e,a_e+l_e+l_m,a_e+l_m,a_w+l_w")
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; Branch and Call Instructions
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; Branch instructions are difficult to model accurately. The ARM
+;; core can predict most branches. If the branch is predicted
+;; correctly, and predicted early enough, the branch can be completely
+;; eliminated from the instruction stream. Some branches can
+;; therefore appear to require zero cycles to execute. We assume that
+;; all branches are predicted correctly, and that the latency is
+;; therefore the minimum value.
+
+(define_insn_reservation "branch_op" 0
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "branch"))
+ "nothing")
+
+;; The latency for a call is not predictable. Therefore, we use 32 as
+;; roughly equivalent to positive infinity.
+
+(define_insn_reservation "call_op" 32
+ (and (eq_attr "tune" "arm1026ejs")
+ (eq_attr "type" "call"))
+ "nothing")
diff --git a/gcc/config/arm/arm1136jfs.md b/gcc/config/arm/arm1136jfs.md
new file mode 100644
index 00000000000..acfce1b5681
--- /dev/null
+++ b/gcc/config/arm/arm1136jfs.md
@@ -0,0 +1,377 @@
+;; ARM 1136J[F]-S Pipeline Description
+;; Copyright (C) 2003 Free Software Foundation, Inc.
+;; Written by CodeSourcery, LLC.
+;;
+;; This file is part of GCC.
+;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING. If not, write to the Free +;; Software Foundation, 59 Temple Place - Suite 330, Boston, MA +;; 02111-1307, USA. */ + +;; These descriptions are based on the information contained in the +;; ARM1136JF-S Technical Reference Manual, Copyright (c) 2003 ARM +;; Limited. +;; + +;; This automaton provides a pipeline description for the ARM +;; 1136J-S and 1136JF-S cores. +;; +;; The model given here assumes that the condition for all conditional +;; instructions is "true", i.e., that all of the instructions are +;; actually executed. + +(define_automaton "arm1136jfs") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Pipelines +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; There are three distinct pipelines (page 1-26 and following): +;; +;; - A 4-stage decode pipeline, shared by all three. It has fetch (1), +;; fetch (2), decode, and issue stages. Since this is always involved, +;; we do not model it in the scheduler. +;; +;; - A 4-stage ALU pipeline. It has shifter, ALU (main integer operations), +;; and saturation stages. The fourth stage is writeback; see below. +;; +;; - A 4-stage multiply-accumulate pipeline. It has three stages, called +;; MAC1 through MAC3, and a fourth writeback stage. +;; +;; The 4th-stage writeback is shared between the ALU and MAC pipelines, +;; which operate in lockstep. 
Results from either pipeline will be
+;; moved into the writeback stage. Because the two pipelines operate
+;; in lockstep, we schedule them as a single "execute" pipeline.
+;;
+;; - A 4-stage LSU pipeline. It has address generation, data cache (1),
+;; data cache (2), and writeback stages. (Note that this pipeline,
+;; including the writeback stage, is independent from the ALU & LSU pipes.)
+
+(define_cpu_unit "e_1,e_2,e_3,e_wb" "arm1136jfs") ; ALU and MAC
+; e_1 = Sh/Mac1, e_2 = ALU/Mac2, e_3 = SAT/Mac3
+(define_cpu_unit "l_a,l_dc1,l_dc2,l_wb" "arm1136jfs") ; Load/Store
+
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+;; ALU Instructions
+;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
+
+;; ALU instructions require eight cycles to execute, and use the ALU
+;; pipeline in each of the eight stages. The results are available
+;; after the alu stage has finished.
+;;
+;; If the destination register is the PC, the pipelines are stalled
+;; for several cycles. That case is not modelled here.
+
+;; ALU operations with no shifted operand
+(define_insn_reservation "11_alu_op" 2
+ (and (eq_attr "tune" "arm1136js,arm1136jfs")
+ (eq_attr "type" "alu"))
+ "e_1,e_2,e_3,e_wb")
+
+;; ALU operations with a shift-by-constant operand
+(define_insn_reservation "11_alu_shift_op" 2
+ (and (eq_attr "tune" "arm1136js,arm1136jfs")
+ (eq_attr "type" "alu_shift"))
+ "e_1,e_2,e_3,e_wb")
+
+;; ALU operations with a shift-by-register operand
+;; These really stall in the decoder, in order to read
+;; the shift value in a second cycle. Pretend we take two cycles in
+;; the shift stage.
+(define_insn_reservation "11_alu_shift_reg_op" 3 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "alu_shift_reg")) + "e_1*2,e_2,e_3,e_wb") + +;; alu_ops can start sooner, if there is no shifter dependency +(define_bypass 1 "11_alu_op,11_alu_shift_op" + "11_alu_op") +(define_bypass 1 "11_alu_op,11_alu_shift_op" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 1 "11_alu_op,11_alu_shift_op" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") +(define_bypass 2 "11_alu_shift_reg_op" + "11_alu_op") +(define_bypass 2 "11_alu_shift_reg_op" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 2 "11_alu_shift_reg_op" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") + +(define_bypass 1 "11_alu_op,11_alu_shift_op" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 2 "11_alu_shift_reg_op" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Multiplication Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; Multiplication instructions loop in the first two execute stages until +;; the instruction has been passed through the multiplier array enough +;; times. + +;; Multiply and multiply-accumulate results are available after four stages. +(define_insn_reservation "11_mult1" 4 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "mul,mla")) + "e_1*2,e_2,e_3,e_wb") + +;; The *S variants set the condition flags, which requires three more cycles. 
+(define_insn_reservation "11_mult2" 4 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "muls,mlas")) + "e_1*2,e_2,e_3,e_wb") + +(define_bypass 3 "11_mult1,11_mult2" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 3 "11_mult1,11_mult2" + "11_alu_op") +(define_bypass 3 "11_mult1,11_mult2" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 3 "11_mult1,11_mult2" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") +(define_bypass 3 "11_mult1,11_mult2" + "11_store1" + "arm_no_early_store_addr_dep") + +;; Signed and unsigned multiply long results are available across two cycles; +;; the less significant word is available one cycle before the more significant +;; word. Here we conservatively wait until both are available, which is +;; after three iterations and the memory cycle. The same is also true of +;; the two multiply-accumulate instructions. +(define_insn_reservation "11_mult3" 5 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "smull,umull,smlal,umlal")) + "e_1*3,e_2,e_3,e_wb*2") + +;; The *S variants set the condition flags, which requires three more cycles. +(define_insn_reservation "11_mult4" 5 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "smulls,umulls,smlals,umlals")) + "e_1*3,e_2,e_3,e_wb*2") + +(define_bypass 4 "11_mult3,11_mult4" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 4 "11_mult3,11_mult4" + "11_alu_op") +(define_bypass 4 "11_mult3,11_mult4" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 4 "11_mult3,11_mult4" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") +(define_bypass 4 "11_mult3,11_mult4" + "11_store1" + "arm_no_early_store_addr_dep") + +;; Various 16x16->32 multiplies and multiply-accumulates, using combinations +;; of high and low halves of the argument registers. 
They take a single +;; pass through the pipeline and make the result available after three +;; cycles. +(define_insn_reservation "11_mult5" 3 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "smulxy,smlaxy,smulwy,smlawy,smuad,smuadx,smlad,smladx,smusd,smusdx,smlsd,smlsdx")) + "e_1,e_2,e_3,e_wb") + +(define_bypass 2 "11_mult5" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 2 "11_mult5" + "11_alu_op") +(define_bypass 2 "11_mult5" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 2 "11_mult5" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") +(define_bypass 2 "11_mult5" + "11_store1" + "arm_no_early_store_addr_dep") + +;; The same idea, then the 32-bit result is added to a 64-bit quantity. +(define_insn_reservation "11_mult6" 4 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "smlalxy")) + "e_1*2,e_2,e_3,e_wb*2") + +;; Signed 32x32 multiply, then the most significant 32 bits are extracted +;; and are available after the memory stage. +(define_insn_reservation "11_mult7" 4 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "insn" "smmul,smmulr")) + "e_1*2,e_2,e_3,e_wb") + +(define_bypass 3 "11_mult6,11_mult7" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 3 "11_mult6,11_mult7" + "11_alu_op") +(define_bypass 3 "11_mult6,11_mult7" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 3 "11_mult6,11_mult7" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") +(define_bypass 3 "11_mult6,11_mult7" + "11_store1" + "arm_no_early_store_addr_dep") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Branch Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; These vary greatly depending on their arguments and the results of +;; stat prediction. 
Cycle count ranges from zero (unconditional branch, +;; folded dynamic prediction) to seven (incorrect predictions, etc). We +;; assume an optimal case for now, because the cost of a cache miss +;; overwhelms the cost of everything else anyhow. + +(define_insn_reservation "11_branches" 0 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "branch")) + "nothing") + +;; Call latencies are not predictable. A semi-arbitrary very large +;; number is used as "positive infinity" so that everything should be +;; finished by the time of return. +(define_insn_reservation "11_call" 32 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "call")) + "nothing") + +;; Branches are predicted. A correctly predicted branch will be no +;; cost, but we're conservative here, and use the timings a +;; late-register would give us. +(define_bypass 1 "11_alu_op,11_alu_shift_op" + "11_branches") +(define_bypass 2 "11_alu_shift_reg_op" + "11_branches") +(define_bypass 2 "11_load1,11_load2" + "11_branches") +(define_bypass 3 "11_load34" + "11_branches") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Load/Store Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; The models for load/store instructions do not accurately describe +;; the difference between operations with a base register writeback. +;; These models assume that all memory references hit in dcache. Also, +;; if the PC is one of the registers involved, there are additional stalls +;; not modelled here. Addressing modes are also not modelled. + +(define_insn_reservation "11_load1" 3 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "load1")) + "l_a+e_1,l_dc1,l_dc2,l_wb") + +;; Load byte results are not available until the writeback stage, where +;; the correct byte is extracted. 
+ +(define_insn_reservation "11_loadb" 4 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "load_byte")) + "l_a+e_1,l_dc1,l_dc2,l_wb") + +(define_insn_reservation "11_store1" 0 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "store1")) + "l_a+e_1,l_dc1,l_dc2,l_wb") + +;; Load/store double words into adjacent registers. The timing and +;; latencies are different depending on whether the address is 64-bit +;; aligned. This model assumes that it is. +(define_insn_reservation "11_load2" 3 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "load2")) + "l_a+e_1,l_dc1,l_dc2,l_wb") + +(define_insn_reservation "11_store2" 0 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "store2")) + "l_a+e_1,l_dc1,l_dc2,l_wb") + +;; Load/store multiple registers. Two registers are stored per cycle. +;; Actual timing depends on how many registers are affected, so we +;; optimistically schedule a low latency. +(define_insn_reservation "11_load34" 4 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "load3,load4")) + "l_a+e_1,l_dc1*2,l_dc2,l_wb") + +(define_insn_reservation "11_store34" 0 + (and (eq_attr "tune" "arm1136js,arm1136jfs") + (eq_attr "type" "store3,store4")) + "l_a+e_1,l_dc1*2,l_dc2,l_wb") + +;; A store can start immediately after an alu op, if that alu op does +;; not provide part of the address to access. 
+(define_bypass 1 "11_alu_op,11_alu_shift_op" + "11_store1" + "arm_no_early_store_addr_dep") +(define_bypass 2 "11_alu_shift_reg_op" + "11_store1" + "arm_no_early_store_addr_dep") + +;; An alu op can start sooner after a load, if that alu op does not +;; have an early register dependency on the load +(define_bypass 2 "11_load1" + "11_alu_op") +(define_bypass 2 "11_load1" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 2 "11_load1" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") + +(define_bypass 3 "11_loadb" + "11_alu_op") +(define_bypass 3 "11_loadb" + "11_alu_shift_op" + "arm_no_early_alu_shift_value_dep") +(define_bypass 3 "11_loadb" + "11_alu_shift_reg_op" + "arm_no_early_alu_shift_dep") + +;; A mul op can start sooner after a load, if that mul op does not +;; have an early multiply dependency +(define_bypass 2 "11_load1" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 3 "11_load34" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") +(define_bypass 3 "11_loadb" + "11_mult1,11_mult2,11_mult3,11_mult4,11_mult5,11_mult6,11_mult7" + "arm_no_early_mul_dep") + +;; A store can start sooner after a load, if that load does not +;; produce part of the address to access +(define_bypass 2 "11_load1" + "11_store1" + "arm_no_early_store_addr_dep") +(define_bypass 3 "11_loadb" + "11_store1" + "arm_no_early_store_addr_dep") diff --git a/gcc/config/arm/arm926ejs.md b/gcc/config/arm/arm926ejs.md new file mode 100644 index 00000000000..e8ba17cf551 --- /dev/null +++ b/gcc/config/arm/arm926ejs.md @@ -0,0 +1,188 @@ +;; ARM 926EJ-S Pipeline Description +;; Copyright (C) 2003 Free Software Foundation, Inc. +;; Written by CodeSourcery, LLC. +;; +;; This file is part of GCC. 
+;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING. If not, write to the Free +;; Software Foundation, 59 Temple Place - Suite 330, Boston, MA +;; 02111-1307, USA. */ + +;; These descriptions are based on the information contained in the +;; ARM926EJ-S Technical Reference Manual, Copyright (c) 2002 ARM +;; Limited. +;; + +;; This automaton provides a pipeline description for the ARM +;; 926EJ-S core. +;; +;; The model given here assumes that the condition for all conditional +;; instructions is "true", i.e., that all of the instructions are +;; actually executed. + +(define_automaton "arm926ejs") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Pipelines +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; There is a single pipeline +;; +;; The ALU pipeline has fetch, decode, execute, memory, and +;; write stages. We only need to model the execute, memory and write +;; stages. + +(define_cpu_unit "e,m,w" "arm926ejs") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; ALU Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; ALU instructions require three cycles to execute, and use the ALU +;; pipeline in each of the three stages. The results are available +;; after the execute stage has finished. +;; +;; If the destination register is the PC, the pipelines are stalled +;; for several cycles. 
That case is not modeled here. + +;; ALU operations with no shifted operand +(define_insn_reservation "9_alu_op" 1 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "alu,alu_shift")) + "e,m,w") + +;; ALU operations with a shift-by-register operand +;; These really stall in the decoder, in order to read +;; the shift value in a second cycle. Pretend we take two cycles in +;; the execute stage. +(define_insn_reservation "9_alu_shift_reg_op" 2 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "alu_shift_reg")) + "e*2,m,w") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Multiplication Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; Multiplication instructions loop in the execute stage until the +;; instruction has been passed through the multiplier array enough +;; times. Multiply operations occur in both the execute and memory +;; stages of the pipeline + +(define_insn_reservation "9_mult1" 3 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "insn" "smlalxy,mul,mla")) + "e*2,m,w") + +(define_insn_reservation "9_mult2" 4 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "insn" "muls,mlas")) + "e*3,m,w") + +(define_insn_reservation "9_mult3" 4 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "insn" "umull,umlal,smull,smlal")) + "e*3,m,w") + +(define_insn_reservation "9_mult4" 5 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "insn" "umulls,umlals,smulls,smlals")) + "e*4,m,w") + +(define_insn_reservation "9_mult5" 2 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "insn" "smulxy,smlaxy,smlawx")) + "e,m,w") + +(define_insn_reservation "9_mult6" 3 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "insn" "smlalxy")) + "e*2,m,w") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Load/Store Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; The models for load/store instructions do not accurately describe +;; the difference between 
operations with a base register writeback +;; (such as "ldm!"). These models assume that all memory references +;; hit in dcache. + +;; Loads with a shifted offset take 3 cycles, and are (a) probably the +;; most common and (b) the pessimistic assumption will lead to fewer stalls. +(define_insn_reservation "9_load1_op" 3 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "load1,load_byte")) + "e*2,m,w") + +(define_insn_reservation "9_store1_op" 0 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "store1")) + "e,m,w") + +;; multiple word loads and stores +(define_insn_reservation "9_load2_op" 3 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "load2")) + "e,m*2,w") + +(define_insn_reservation "9_load3_op" 4 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "load3")) + "e,m*3,w") + +(define_insn_reservation "9_load4_op" 5 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "load4")) + "e,m*4,w") + +(define_insn_reservation "9_store2_op" 0 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "store2")) + "e,m*2,w") + +(define_insn_reservation "9_store3_op" 0 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "store3")) + "e,m*3,w") + +(define_insn_reservation "9_store4_op" 0 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "store4")) + "e,m*4,w") + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Branch and Call Instructions +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; Branch instructions are difficult to model accurately. The ARM +;; core can predict most branches. If the branch is predicted +;; correctly, and predicted early enough, the branch can be completely +;; eliminated from the instruction stream. Some branches can +;; therefore appear to require zero cycles to execute. We assume that +;; all branches are predicted correctly, and that the latency is +;; therefore the minimum value. 
+ +(define_insn_reservation "9_branch_op" 0 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "branch")) + "nothing") + +;; The latency for a call is not predictable. Therefore, we use 32 as +;; roughly equivalent to positive infinity. + +(define_insn_reservation "9_call_op" 32 + (and (eq_attr "tune" "arm926ejs") + (eq_attr "type" "call")) + "nothing") diff --git a/gcc/config/arm/cirrus.md b/gcc/config/arm/cirrus.md index 0da8469ddd2..9bd01be45cb 100644 --- a/gcc/config/arm/cirrus.md +++ b/gcc/config/arm/cirrus.md @@ -34,7 +34,7 @@ [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (plus:DI (match_operand:DI 1 "cirrus_fp_register" "v") (match_operand:DI 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfadd64%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -44,7 +44,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (plus:SI (match_operand:SI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfadd32%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -54,7 +54,7 @@ [(set (match_operand:SF 0 "cirrus_fp_register" "=v") (plus:SF (match_operand:SF 1 "cirrus_fp_register" "v") (match_operand:SF 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfadds%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -64,7 +64,7 @@ [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (plus:DF (match_operand:DF 1 "cirrus_fp_register" "v") (match_operand:DF 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfaddd%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -74,7 +74,7 @@ [(set (match_operand:DI 0 
"cirrus_fp_register" "=v") (minus:DI (match_operand:DI 1 "cirrus_fp_register" "v") (match_operand:DI 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfsub64%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -84,7 +84,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (minus:SI (match_operand:SI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfsub32%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -94,7 +94,7 @@ [(set (match_operand:SF 0 "cirrus_fp_register" "=v") (minus:SF (match_operand:SF 1 "cirrus_fp_register" "v") (match_operand:SF 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfsubs%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -104,7 +104,7 @@ [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (minus:DF (match_operand:DF 1 "cirrus_fp_register" "v") (match_operand:DF 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfsubd%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -114,7 +114,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (mult:SI (match_operand:SI 2 "cirrus_fp_register" "v") (match_operand:SI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfmul32%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -124,7 +124,7 @@ [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (mult:DI (match_operand:DI 2 "cirrus_fp_register" "v") (match_operand:DI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && 
TARGET_MAVERICK" "cfmul64%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_dmult") (set_attr "cirrus" "normal")] @@ -136,7 +136,7 @@ (mult:SI (match_operand:SI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "cirrus_fp_register" "v")) (match_operand:SI 3 "cirrus_fp_register" "0")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfmac32%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -149,7 +149,7 @@ (match_operand:SI 1 "cirrus_fp_register" "0") (mult:SI (match_operand:SI 2 "cirrus_fp_register" "v") (match_operand:SI 3 "cirrus_fp_register" "v"))))] - "0 && TARGET_ARM && TARGET_CIRRUS" + "0 && TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfmsc32%?\\t%V0, %V2, %V3" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -159,7 +159,7 @@ [(set (match_operand:SF 0 "cirrus_fp_register" "=v") (mult:SF (match_operand:SF 1 "cirrus_fp_register" "v") (match_operand:SF 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfmuls%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_farith") (set_attr "cirrus" "normal")] @@ -169,7 +169,7 @@ [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (mult:DF (match_operand:DF 1 "cirrus_fp_register" "v") (match_operand:DF 2 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfmuld%?\\t%V0, %V1, %V2" [(set_attr "type" "mav_dmult") (set_attr "cirrus" "normal")] @@ -179,7 +179,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (ashift:SI (match_operand:SI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "cirrus_shift_const" "")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfsh32%?\\t%V0, %V1, #%s2" [(set_attr "cirrus" "normal")] ) @@ -188,7 +188,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (ashiftrt:SI (match_operand:SI 1 "cirrus_fp_register" "v") 
(match_operand:SI 2 "cirrus_shift_const" "")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfsh32%?\\t%V0, %V1, #-%s2" [(set_attr "cirrus" "normal")] ) @@ -197,7 +197,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (ashift:SI (match_operand:SI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "register_operand" "r")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfrshl32%?\\t%V1, %V0, %s2" [(set_attr "cirrus" "normal")] ) @@ -206,7 +206,7 @@ [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (ashift:DI (match_operand:DI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "register_operand" "r")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfrshl64%?\\t%V1, %V0, %s2" [(set_attr "cirrus" "normal")] ) @@ -215,7 +215,7 @@ [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (ashift:DI (match_operand:DI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "cirrus_shift_const" "")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfsh64%?\\t%V0, %V1, #%s2" [(set_attr "cirrus" "normal")] ) @@ -224,7 +224,7 @@ [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (ashiftrt:DI (match_operand:DI 1 "cirrus_fp_register" "v") (match_operand:SI 2 "cirrus_shift_const" "")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfsh64%?\\t%V0, %V1, #-%s2" [(set_attr "cirrus" "normal")] ) @@ -232,7 +232,7 @@ (define_insn "*cirrus_absdi2" [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (abs:DI (match_operand:DI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfabs64%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -242,7 +242,7 @@ [(set (match_operand:DI 0 "cirrus_fp_register" "=v") (neg:DI (match_operand:DI 1 "cirrus_fp_register" "v"))) (clobber (reg:CC CC_REGNUM))] - "TARGET_ARM 
&& TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfneg64%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -250,7 +250,7 @@ (define_insn "*cirrus_negsi2" [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (neg:SI (match_operand:SI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfneg32%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -258,7 +258,7 @@ (define_insn "*cirrus_negsf2" [(set (match_operand:SF 0 "cirrus_fp_register" "=v") (neg:SF (match_operand:SF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfnegs%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -266,7 +266,7 @@ (define_insn "*cirrus_negdf2" [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (neg:DF (match_operand:DF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfnegd%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -276,7 +276,7 @@ [(set (match_operand:SI 0 "cirrus_fp_register" "=v") (abs:SI (match_operand:SI 1 "cirrus_fp_register" "v"))) (clobber (reg:CC CC_REGNUM))] - "TARGET_ARM && TARGET_CIRRUS && 0" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0" "cfabs32%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -284,7 +284,7 @@ (define_insn "*cirrus_abssf2" [(set (match_operand:SF 0 "cirrus_fp_register" "=v") (abs:SF (match_operand:SF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfabss%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -292,7 +292,7 @@ (define_insn "*cirrus_absdf2" [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (abs:DF (match_operand:DF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfabsd%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -302,7 +302,7 @@ [(set 
(match_operand:SF 0 "cirrus_fp_register" "=v") (float:SF (match_operand:SI 1 "s_register_operand" "r"))) (clobber (match_scratch:DF 2 "=v"))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfmv64lr%?\\t%Z2, %1\;cfcvt32s%?\\t%V0, %Y2" [(set_attr "length" "8") (set_attr "cirrus" "move")] @@ -312,7 +312,7 @@ [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (float:DF (match_operand:SI 1 "s_register_operand" "r"))) (clobber (match_scratch:DF 2 "=v"))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfmv64lr%?\\t%Z2, %1\;cfcvt32d%?\\t%V0, %Y2" [(set_attr "length" "8") (set_attr "cirrus" "move")] @@ -321,14 +321,14 @@ (define_insn "floatdisf2" [(set (match_operand:SF 0 "cirrus_fp_register" "=v") (float:SF (match_operand:DI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcvt64s%?\\t%V0, %V1" [(set_attr "cirrus" "normal")]) (define_insn "floatdidf2" [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (float:DF (match_operand:DI 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcvt64d%?\\t%V0, %V1" [(set_attr "cirrus" "normal")]) @@ -336,7 +336,7 @@ [(set (match_operand:SI 0 "s_register_operand" "=r") (fix:SI (fix:SF (match_operand:SF 1 "cirrus_fp_register" "v")))) (clobber (match_scratch:DF 2 "=v"))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cftruncs32%?\\t%Y2, %V1\;cfmvr64l%?\\t%0, %Z2" [(set_attr "length" "8") (set_attr "cirrus" "normal")] @@ -346,7 +346,7 @@ [(set (match_operand:SI 0 "s_register_operand" "=r") (fix:SI (fix:DF (match_operand:DF 1 "cirrus_fp_register" "v")))) (clobber (match_scratch:DF 2 "=v"))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cftruncd32%?\\t%Y2, %V1\;cfmvr64l%?\\t%0, %Z2" [(set_attr "length" "8")] ) @@ -355,7 +355,7 @@ [(set 
(match_operand:SF 0 "cirrus_fp_register" "=v") (float_truncate:SF (match_operand:DF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcvtds%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -363,7 +363,7 @@ (define_insn "*cirrus_extendsfdf2" [(set (match_operand:DF 0 "cirrus_fp_register" "=v") (float_extend:DF (match_operand:SF 1 "cirrus_fp_register" "v")))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "cfcvtsd%?\\t%V0, %V1" [(set_attr "cirrus" "normal")] ) @@ -371,7 +371,7 @@ (define_insn "*cirrus_arm_movdi" [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,o<>,v,r,v,m,v") (match_operand:DI 1 "di_operand" "rIK,mi,r,r,v,m,v,v"))] - "TARGET_ARM && TARGET_CIRRUS" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK" "* { switch (which_alternative) @@ -394,7 +394,7 @@ } }" [(set_attr "length" " 8, 8, 8, 8, 8, 4, 4, 4") - (set_attr "type" " *,load,store2, *, *, load,store2, *") + (set_attr "type" " *,load2,store2, *, *, load2,store2, *") (set_attr "pool_range" " *,1020, *, *, *, *, *, *") (set_attr "neg_pool_range" " *,1012, *, *, *, *, *, *") (set_attr "cirrus" "not, not, not,move,normal,double,double,normal")] @@ -406,7 +406,7 @@ (define_insn "*cirrus_arm_movsi_insn" [(set (match_operand:SI 0 "general_operand" "=r,r,r,m,*v,r,*v,T,*v") (match_operand:SI 1 "general_operand" "rI,K,mi,r,r,*v,T,*v,*v"))] - "TARGET_ARM && TARGET_CIRRUS && 0 + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && 0 && (register_operand (operands[0], SImode) || register_operand (operands[1], SImode))" "@ @@ -419,7 +419,7 @@ cfldr32%?\\t%V0, %1 cfstr32%?\\t%V1, %0 cfsh32%?\\t%V0, %V1, #0" - [(set_attr "type" "*, *, load,store1, *, *, load,store1, *") + [(set_attr "type" "*, *, load1,store1, *, *, load1,store1, *") (set_attr "pool_range" "*, *, 4096, *, *, *, 1024, *, *") (set_attr "neg_pool_range" "*, *, 4084, *, *, *, 1012, *, *") (set_attr "cirrus" "not,not, not, 
not,move,normal,normal,normal,normal")] @@ -428,7 +428,7 @@ (define_insn "*cirrus_movsf_hard_insn" [(set (match_operand:SF 0 "nonimmediate_operand" "=v,v,v,r,m,r,r,m") (match_operand:SF 1 "general_operand" "v,m,r,v,v,r,mE,r"))] - "TARGET_ARM && TARGET_CIRRUS + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_MAVERICK && (GET_CODE (operands[0]) != MEM || register_operand (operands[1], SFmode))" "@ @@ -441,7 +441,7 @@ ldr%?\\t%0, %1\\t%@ float str%?\\t%1, %0\\t%@ float" [(set_attr "length" " *, *, *, *, *, 4, 4, 4") - (set_attr "type" " *, load, *, *,store1, *,load,store1") + (set_attr "type" " *, load1, *, *,store1, *,load1,store1") (set_attr "pool_range" " *, *, *, *, *, *,4096, *") (set_attr "neg_pool_range" " *, *, *, *, *, *,4084, *") (set_attr "cirrus" "normal,normal,move,normal,normal,not, not, not")] @@ -451,7 +451,7 @@ [(set (match_operand:DF 0 "nonimmediate_operand" "=r,Q,r,m,r,v,v,v,r,m") (match_operand:DF 1 "general_operand" "Q,r,r,r,mF,v,m,r,v,v"))] "TARGET_ARM - && TARGET_CIRRUS + && TARGET_HARD_FLOAT && TARGET_MAVERICK && (GET_CODE (operands[0]) != MEM || register_operand (operands[1], DFmode))" "* @@ -469,7 +469,7 @@ default: abort (); } }" - [(set_attr "type" "load,store2, *,store2,load, *, load, *, *,store2") + [(set_attr "type" "load1,store2, *,store2,load1, *, load1, *, *,store2") (set_attr "length" " 4, 4, 8, 8, 8, 4, 4, 8, 8, 4") (set_attr "pool_range" " *, *, *, *, 252, *, *, *, *, *") (set_attr "neg_pool_range" " *, *, *, *, 244, *, *, *, *, *") diff --git a/gcc/config/arm/elf.h b/gcc/config/arm/elf.h index 581f7267900..e3f393aa9d9 100644 --- a/gcc/config/arm/elf.h +++ b/gcc/config/arm/elf.h @@ -46,7 +46,7 @@ #ifndef SUBTARGET_ASM_FLOAT_SPEC #define SUBTARGET_ASM_FLOAT_SPEC "\ -%{mapcs-float:-mfloat} %{msoft-float:-mfpu=softfpa}" +%{mapcs-float:-mfloat}" #endif #ifndef ASM_SPEC @@ -58,6 +58,8 @@ %{mapcs-*:-mapcs-%*} \ %(subtarget_asm_float_spec) \ %{mthumb-interwork:-mthumb-interwork} \ +%{msoft-float:-mfloat-abi=soft} 
%{mhard-float:-mfloat-abi=hard} \ +%{mfloat-abi=*} %{mfpu=*} \ %(subtarget_extra_asm_spec)" #endif diff --git a/gcc/config/arm/fpa.md b/gcc/config/arm/fpa.md index 3b6efbfbbda..2eb74ee8c8b 100644 --- a/gcc/config/arm/fpa.md +++ b/gcc/config/arm/fpa.md @@ -100,8 +100,8 @@ (define_insn "*addsf3_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f,f") (plus:SF (match_operand:SF 1 "s_register_operand" "%f,f") - (match_operand:SF 2 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 2 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ adf%?s\\t%0, %1, %2 suf%?s\\t%0, %1, #%N2" @@ -112,8 +112,8 @@ (define_insn "*adddf3_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f,f") (plus:DF (match_operand:DF 1 "s_register_operand" "%f,f") - (match_operand:DF 2 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ adf%?d\\t%0, %1, %2 suf%?d\\t%0, %1, #%N2" @@ -125,8 +125,8 @@ [(set (match_operand:DF 0 "s_register_operand" "=f,f") (plus:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f,f")) - (match_operand:DF 2 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ adf%?d\\t%0, %1, %2 suf%?d\\t%0, %1, #%N2" @@ -139,7 +139,7 @@ (plus:DF (match_operand:DF 1 "s_register_operand" "f") (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "adf%?d\\t%0, %1, %2" [(set_attr "type" "farith") (set_attr "predicable" "yes")] @@ -151,7 +151,7 @@ (match_operand:SF 1 "s_register_operand" "f")) (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "adf%?d\\t%0, %1, %2" 
[(set_attr "type" "farith") (set_attr "predicable" "yes")] @@ -159,9 +159,9 @@ (define_insn "*subsf3_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f,f") - (minus:SF (match_operand:SF 1 "fpa_rhs_operand" "f,G") - (match_operand:SF 2 "fpa_rhs_operand" "fG,f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (minus:SF (match_operand:SF 1 "arm_float_rhs_operand" "f,G") + (match_operand:SF 2 "arm_float_rhs_operand" "fG,f")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ suf%?s\\t%0, %1, %2 rsf%?s\\t%0, %2, %1" @@ -170,9 +170,9 @@ (define_insn "*subdf3_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f,f") - (minus:DF (match_operand:DF 1 "fpa_rhs_operand" "f,G") - (match_operand:DF 2 "fpa_rhs_operand" "fG,f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (minus:DF (match_operand:DF 1 "arm_float_rhs_operand" "f,G") + (match_operand:DF 2 "arm_float_rhs_operand" "fG,f")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ suf%?d\\t%0, %1, %2 rsf%?d\\t%0, %2, %1" @@ -184,8 +184,8 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (minus:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f")) - (match_operand:DF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "suf%?d\\t%0, %1, %2" [(set_attr "type" "farith") (set_attr "predicable" "yes")] @@ -193,10 +193,10 @@ (define_insn "*subdf_df_esfdf_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f,f") - (minus:DF (match_operand:DF 1 "fpa_rhs_operand" "f,G") + (minus:DF (match_operand:DF 1 "arm_float_rhs_operand" "f,G") (float_extend:DF (match_operand:SF 2 "s_register_operand" "f,f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ suf%?d\\t%0, %1, %2 rsf%?d\\t%0, %2, %1" @@ -210,7 +210,7 @@ (match_operand:SF 1 "s_register_operand" "f")) (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + 
"TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "suf%?d\\t%0, %1, %2" [(set_attr "type" "farith") (set_attr "predicable" "yes")] @@ -219,8 +219,8 @@ (define_insn "*mulsf3_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f") (mult:SF (match_operand:SF 1 "s_register_operand" "f") - (match_operand:SF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "fml%?s\\t%0, %1, %2" [(set_attr "type" "ffmul") (set_attr "predicable" "yes")] @@ -229,8 +229,8 @@ (define_insn "*muldf3_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (mult:DF (match_operand:DF 1 "s_register_operand" "f") - (match_operand:DF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "muf%?d\\t%0, %1, %2" [(set_attr "type" "fmul") (set_attr "predicable" "yes")] @@ -240,8 +240,8 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (mult:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f")) - (match_operand:DF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "muf%?d\\t%0, %1, %2" [(set_attr "type" "fmul") (set_attr "predicable" "yes")] @@ -252,7 +252,7 @@ (mult:DF (match_operand:DF 1 "s_register_operand" "f") (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "muf%?d\\t%0, %1, %2" [(set_attr "type" "fmul") (set_attr "predicable" "yes")] @@ -263,7 +263,7 @@ (mult:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f")) (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "muf%?d\\t%0, %1, %2" [(set_attr "type" "fmul") 
(set_attr "predicable" "yes")] @@ -273,9 +273,9 @@ (define_insn "*divsf3_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f,f") - (div:SF (match_operand:SF 1 "fpa_rhs_operand" "f,G") - (match_operand:SF 2 "fpa_rhs_operand" "fG,f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (div:SF (match_operand:SF 1 "arm_float_rhs_operand" "f,G") + (match_operand:SF 2 "arm_float_rhs_operand" "fG,f")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ fdv%?s\\t%0, %1, %2 frd%?s\\t%0, %2, %1" @@ -285,9 +285,9 @@ (define_insn "*divdf3_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f,f") - (div:DF (match_operand:DF 1 "fpa_rhs_operand" "f,G") - (match_operand:DF 2 "fpa_rhs_operand" "fG,f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (div:DF (match_operand:DF 1 "arm_float_rhs_operand" "f,G") + (match_operand:DF 2 "arm_float_rhs_operand" "fG,f")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ dvf%?d\\t%0, %1, %2 rdf%?d\\t%0, %2, %1" @@ -299,8 +299,8 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (div:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f")) - (match_operand:DF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "dvf%?d\\t%0, %1, %2" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -308,10 +308,10 @@ (define_insn "*divdf_df_esfdf_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") - (div:DF (match_operand:DF 1 "fpa_rhs_operand" "fG") + (div:DF (match_operand:DF 1 "arm_float_rhs_operand" "fG") (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "rdf%?d\\t%0, %2, %1" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -323,7 +323,7 @@ (match_operand:SF 1 "s_register_operand" "f")) (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM 
&& TARGET_HARD_FLOAT && TARGET_FPA" "dvf%?d\\t%0, %1, %2" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -332,8 +332,8 @@ (define_insn "*modsf3_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f") (mod:SF (match_operand:SF 1 "s_register_operand" "f") - (match_operand:SF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "rmf%?s\\t%0, %1, %2" [(set_attr "type" "fdivs") (set_attr "predicable" "yes")] @@ -342,8 +342,8 @@ (define_insn "*moddf3_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (mod:DF (match_operand:DF 1 "s_register_operand" "f") - (match_operand:DF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "rmf%?d\\t%0, %1, %2" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -353,8 +353,8 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (mod:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f")) - (match_operand:DF 2 "fpa_rhs_operand" "fG")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 2 "arm_float_rhs_operand" "fG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "rmf%?d\\t%0, %1, %2" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -365,7 +365,7 @@ (mod:DF (match_operand:DF 1 "s_register_operand" "f") (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "rmf%?d\\t%0, %1, %2" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -377,7 +377,7 @@ (match_operand:SF 1 "s_register_operand" "f")) (float_extend:DF (match_operand:SF 2 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "rmf%?d\\t%0, %1, %2" [(set_attr "type" "fdivd") (set_attr "predicable" "yes")] @@ -386,7 
+386,7 @@ (define_insn "*negsf2_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f") (neg:SF (match_operand:SF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "mnf%?s\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -395,7 +395,7 @@ (define_insn "*negdf2_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (neg:DF (match_operand:DF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "mnf%?d\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -405,7 +405,7 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (neg:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "mnf%?d\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -414,7 +414,7 @@ (define_insn "*abssf2_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f") (abs:SF (match_operand:SF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "abs%?s\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -423,7 +423,7 @@ (define_insn "*absdf2_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (abs:DF (match_operand:DF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "abs%?d\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -433,7 +433,7 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (abs:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "abs%?d\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -442,7 +442,7 @@ (define_insn "*sqrtsf2_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f") 
(sqrt:SF (match_operand:SF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "sqt%?s\\t%0, %1" [(set_attr "type" "float_em") (set_attr "predicable" "yes")] @@ -451,7 +451,7 @@ (define_insn "*sqrtdf2_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (sqrt:DF (match_operand:DF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "sqt%?d\\t%0, %1" [(set_attr "type" "float_em") (set_attr "predicable" "yes")] @@ -461,7 +461,7 @@ [(set (match_operand:DF 0 "s_register_operand" "=f") (sqrt:DF (float_extend:DF (match_operand:SF 1 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "sqt%?d\\t%0, %1" [(set_attr "type" "float_em") (set_attr "predicable" "yes")] @@ -470,7 +470,7 @@ (define_insn "*floatsisf2_fpa" [(set (match_operand:SF 0 "s_register_operand" "=f") (float:SF (match_operand:SI 1 "s_register_operand" "r")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "flt%?s\\t%0, %1" [(set_attr "type" "r_2_f") (set_attr "predicable" "yes")] @@ -479,7 +479,7 @@ (define_insn "*floatsidf2_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (float:DF (match_operand:SI 1 "s_register_operand" "r")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "flt%?d\\t%0, %1" [(set_attr "type" "r_2_f") (set_attr "predicable" "yes")] @@ -488,7 +488,7 @@ (define_insn "*fix_truncsfsi2_fpa" [(set (match_operand:SI 0 "s_register_operand" "=r") (fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "fix%?z\\t%0, %1" [(set_attr "type" "f_2_r") (set_attr "predicable" "yes")] @@ -497,7 +497,7 @@ (define_insn "*fix_truncdfsi2_fpa" [(set (match_operand:SI 0 "s_register_operand" "=r") (fix:SI (fix:DF (match_operand:DF 1 
"s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "fix%?z\\t%0, %1" [(set_attr "type" "f_2_r") (set_attr "predicable" "yes")] @@ -507,7 +507,7 @@ [(set (match_operand:SF 0 "s_register_operand" "=f") (float_truncate:SF (match_operand:DF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "mvf%?s\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -516,7 +516,7 @@ (define_insn "*extendsfdf2_fpa" [(set (match_operand:DF 0 "s_register_operand" "=f") (float_extend:DF (match_operand:SF 1 "s_register_operand" "f")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "mvf%?d\\t%0, %1" [(set_attr "type" "ffarith") (set_attr "predicable" "yes")] @@ -526,7 +526,7 @@ [(set (match_operand:SF 0 "nonimmediate_operand" "=f,f,f, m,f,r,r,r, m") (match_operand:SF 1 "general_operand" "fG,H,mE,f,r,f,r,mE,r"))] "TARGET_ARM - && TARGET_HARD_FLOAT + && TARGET_HARD_FLOAT && TARGET_FPA && (GET_CODE (operands[0]) != MEM || register_operand (operands[1], SFmode))" "@ @@ -542,7 +542,7 @@ [(set_attr "length" "4,4,4,4,8,8,4,4,4") (set_attr "predicable" "yes") (set_attr "type" - "ffarith,ffarith,f_load,f_store,r_mem_f,f_mem_r,*,load,store1") + "ffarith,ffarith,f_load,f_store,r_mem_f,f_mem_r,*,load1,store1") (set_attr "pool_range" "*,*,1024,*,*,*,*,4096,*") (set_attr "neg_pool_range" "*,*,1012,*,*,*,*,4084,*")] ) @@ -553,7 +553,7 @@ (match_operand:DF 1 "general_operand" "Q, r,r,r,mF,fG,H,mF,f,r, f"))] "TARGET_ARM - && TARGET_HARD_FLOAT + && TARGET_HARD_FLOAT && TARGET_FPA && (GET_CODE (operands[0]) != MEM || register_operand (operands[1], DFmode))" "* @@ -576,7 +576,7 @@ [(set_attr "length" "4,4,8,8,8,4,4,4,4,8,8") (set_attr "predicable" "yes") (set_attr "type" - "load,store2,*,store2,load,ffarith,ffarith,f_load,f_store,r_mem_f,f_mem_r") + 
"load1,store2,*,store2,load1,ffarith,ffarith,f_load,f_store,r_mem_f,f_mem_r") (set_attr "pool_range" "*,*,*,*,1020,*,*,1024,*,*,*") (set_attr "neg_pool_range" "*,*,*,*,1008,*,*,1008,*,*,*")] ) @@ -589,7 +589,7 @@ (define_insn "*movxf_fpa" [(set (match_operand:XF 0 "nonimmediate_operand" "=f,f,f,m,f,r,r") (match_operand:XF 1 "general_operand" "fG,H,m,f,r,f,r"))] - "TARGET_ARM && TARGET_HARD_FLOAT && reload_completed" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA && reload_completed" "* switch (which_alternative) { @@ -613,8 +613,8 @@ (define_insn "*cmpsf_fpa" [(set (reg:CCFP CC_REGNUM) (compare:CCFP (match_operand:SF 0 "s_register_operand" "f,f") - (match_operand:SF 1 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 1 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ cmf%?\\t%0, %1 cnf%?\\t%0, #%N1" @@ -625,8 +625,8 @@ (define_insn "*cmpdf_fpa" [(set (reg:CCFP CC_REGNUM) (compare:CCFP (match_operand:DF 0 "s_register_operand" "f,f") - (match_operand:DF 1 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 1 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ cmf%?\\t%0, %1 cnf%?\\t%0, #%N1" @@ -638,8 +638,8 @@ [(set (reg:CCFP CC_REGNUM) (compare:CCFP (float_extend:DF (match_operand:SF 0 "s_register_operand" "f,f")) - (match_operand:DF 1 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 1 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ cmf%?\\t%0, %1 cnf%?\\t%0, #%N1" @@ -652,7 +652,7 @@ (compare:CCFP (match_operand:DF 0 "s_register_operand" "f") (float_extend:DF (match_operand:SF 1 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "cmf%?\\t%0, %1" [(set_attr "conds" "set") (set_attr "type" "f_2_r")] @@ -661,8 +661,8 @@ (define_insn "*cmpsf_trap_fpa" [(set (reg:CCFPE CC_REGNUM) 
(compare:CCFPE (match_operand:SF 0 "s_register_operand" "f,f") - (match_operand:SF 1 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 1 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ cmf%?e\\t%0, %1 cnf%?e\\t%0, #%N1" @@ -673,8 +673,8 @@ (define_insn "*cmpdf_trap_fpa" [(set (reg:CCFPE CC_REGNUM) (compare:CCFPE (match_operand:DF 0 "s_register_operand" "f,f") - (match_operand:DF 1 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 1 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ cmf%?e\\t%0, %1 cnf%?e\\t%0, #%N1" @@ -686,8 +686,8 @@ [(set (reg:CCFPE CC_REGNUM) (compare:CCFPE (float_extend:DF (match_operand:SF 0 "s_register_operand" "f,f")) - (match_operand:DF 1 "fpa_add_operand" "fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 1 "arm_float_add_operand" "fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ cmf%?e\\t%0, %1 cnf%?e\\t%0, #%N1" @@ -700,7 +700,7 @@ (compare:CCFPE (match_operand:DF 0 "s_register_operand" "f") (float_extend:DF (match_operand:SF 1 "s_register_operand" "f"))))] - "TARGET_ARM && TARGET_HARD_FLOAT" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "cmf%?e\\t%0, %1" [(set_attr "conds" "set") (set_attr "type" "f_2_r")] @@ -711,9 +711,9 @@ (if_then_else:SF (match_operator 3 "arm_comparison_operator" [(match_operand 4 "cc_register" "") (const_int 0)]) - (match_operand:SF 1 "fpa_add_operand" "0,0,fG,H,fG,fG,H,H") - (match_operand:SF 2 "fpa_add_operand" "fG,H,0,0,fG,H,fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:SF 1 "arm_float_add_operand" "0,0,fG,H,fG,fG,H,H") + (match_operand:SF 2 "arm_float_add_operand" "fG,H,0,0,fG,H,fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ mvf%D3s\\t%0, %2 mnf%D3s\\t%0, #%N2 @@ -733,9 +733,9 @@ (if_then_else:DF (match_operator 3 "arm_comparison_operator" [(match_operand 4 "cc_register" "") (const_int 0)]) - 
(match_operand:DF 1 "fpa_add_operand" "0,0,fG,H,fG,fG,H,H") - (match_operand:DF 2 "fpa_add_operand" "fG,H,0,0,fG,H,fG,H")))] - "TARGET_ARM && TARGET_HARD_FLOAT" + (match_operand:DF 1 "arm_float_add_operand" "0,0,fG,H,fG,fG,H,H") + (match_operand:DF 2 "arm_float_add_operand" "fG,H,0,0,fG,H,fG,H")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_FPA" "@ mvf%D3d\\t%0, %2 mnf%D3d\\t%0, #%N2 diff --git a/gcc/config/arm/iwmmxt.md b/gcc/config/arm/iwmmxt.md index f8070a88594..6c87500093e 100644 --- a/gcc/config/arm/iwmmxt.md +++ b/gcc/config/arm/iwmmxt.md @@ -86,7 +86,7 @@ } }" [(set_attr "length" "8,8,8,4,4,4,4,4") - (set_attr "type" "*,load,store2,*,*,*,*,*") + (set_attr "type" "*,load1,store2,*,*,*,*,*") (set_attr "pool_range" "*,1020,*,*,*,*,*,*") (set_attr "neg_pool_range" "*,1012,*,*,*,*,*,*")] ) @@ -110,7 +110,7 @@ case 7: return \"wstrw\\t%1, %0\"; default:return \"wstrw\\t%1, [sp, #-4]!\;wldrw\\t%0, [sp], #4\\t@move CG reg\"; }" - [(set_attr "type" "*,*,load,store1,*,*,load,store1,*") + [(set_attr "type" "*,*,load1,store1,*,*,load1,store1,*") (set_attr "length" "*,*,*, *,*,*, 16, *,8") (set_attr "pool_range" "*,*,4096, *,*,*,1024, *,*") (set_attr "neg_pool_range" "*,*,4084, *,*,*, *, 1012,*") @@ -148,7 +148,7 @@ case 4: return \"tmcr%?\\t%0, %1\"; default: return \"tmrc%?\\t%0, %1\"; }" - [(set_attr "type" "*,*,load,store1,*,*") + [(set_attr "type" "*,*,load1,store1,*,*") (set_attr "pool_range" "*,*,4096, *,*,*") (set_attr "neg_pool_range" "*,*,4084, *,*,*")] ) @@ -169,7 +169,7 @@ }" [(set_attr "predicable" "yes") (set_attr "length" "4, 4, 4,4,4, 8") - (set_attr "type" "*,store1,load,*,*,load") + (set_attr "type" "*,store1,load1,*,*,load1") (set_attr "pool_range" "*, *, 256,*,*, 256") (set_attr "neg_pool_range" "*, *, 244,*,*, 244")]) @@ -189,7 +189,7 @@ }" [(set_attr "predicable" "yes") (set_attr "length" "4, 4, 4,4,4, 8") - (set_attr "type" "*,store1,load,*,*,load") + (set_attr "type" "*,store1,load1,*,*,load1") (set_attr "pool_range" "*, *, 256,*,*, 256") 
(set_attr "neg_pool_range" "*, *, 244,*,*, 244")]) @@ -209,7 +209,7 @@ }" [(set_attr "predicable" "yes") (set_attr "length" "4, 4, 4,4,4, 24") - (set_attr "type" "*,store1,load,*,*,load") + (set_attr "type" "*,store1,load1,*,*,load1") (set_attr "pool_range" "*, *, 256,*,*, 256") (set_attr "neg_pool_range" "*, *, 244,*,*, 244")]) @@ -225,7 +225,7 @@ "* return output_move_double (operands);" [(set_attr "predicable" "yes") (set_attr "length" "8") - (set_attr "type" "load") + (set_attr "type" "load1") (set_attr "pool_range" "256") (set_attr "neg_pool_range" "244")]) @@ -1149,7 +1149,7 @@ "wsrawg%?\\t%0, %1, %2" [(set_attr "predicable" "yes")]) -(define_insn "ashrdi3" +(define_insn "ashrdi3_iwmmxt" [(set (match_operand:DI 0 "register_operand" "=y") (ashiftrt:DI (match_operand:DI 1 "register_operand" "y") (match_operand:SI 2 "register_operand" "z")))] @@ -1173,7 +1173,7 @@ "wsrlwg%?\\t%0, %1, %2" [(set_attr "predicable" "yes")]) -(define_insn "lshrdi3" +(define_insn "lshrdi3_iwmmxt" [(set (match_operand:DI 0 "register_operand" "=y") (lshiftrt:DI (match_operand:DI 1 "register_operand" "y") (match_operand:SI 2 "register_operand" "z")))] diff --git a/gcc/config/arm/lib1funcs.asm b/gcc/config/arm/lib1funcs.asm index e72af6cca51..cdf71151b58 100644 --- a/gcc/config/arm/lib1funcs.asm +++ b/gcc/config/arm/lib1funcs.asm @@ -243,6 +243,25 @@ pc .req r15 /* ------------------------------------------------------------------------ */ .macro ARM_DIV_BODY dividend, divisor, result, curbit +#if __ARM_ARCH__ >= 5 && ! 
defined (__OPTIMIZE_SIZE__) + + clz \curbit, \dividend + clz \result, \divisor + sub \curbit, \result, \curbit + rsbs \curbit, \curbit, #31 + addne \curbit, \curbit, \curbit, lsl #1 + mov \result, #0 + addne pc, pc, \curbit, lsl #2 + nop + .set shift, 32 + .rept 32 + .set shift, shift - 1 + cmp \dividend, \divisor, lsl #shift + adc \result, \result, \result + subcs \dividend, \dividend, \divisor, lsl #shift + .endr + +#else /* __ARM_ARCH__ < 5 || defined (__OPTIMIZE_SIZE__) */ #if __ARM_ARCH__ >= 5 clz \curbit, \divisor @@ -253,7 +272,7 @@ pc .req r15 mov \curbit, \curbit, lsl \result mov \result, #0 -#else +#else /* __ARM_ARCH__ < 5 */ @ Initially shift the divisor left 3 bits if possible, @ set curbit accordingly. This allows for curbit to be located @@ -284,7 +303,7 @@ pc .req r15 mov \result, #0 -#endif +#endif /* __ARM_ARCH__ < 5 */ @ Division loop 1: cmp \dividend, \divisor @@ -304,6 +323,8 @@ pc .req r15 movne \divisor, \divisor, lsr #4 bne 1b +#endif /* __ARM_ARCH__ < 5 || defined (__OPTIMIZE_SIZE__) */ + .endm /* ------------------------------------------------------------------------ */ .macro ARM_DIV2_ORDER divisor, order @@ -338,6 +359,22 @@ pc .req r15 /* ------------------------------------------------------------------------ */ .macro ARM_MOD_BODY dividend, divisor, order, spare +#if __ARM_ARCH__ >= 5 && ! 
defined (__OPTIMIZE_SIZE__) + + clz \order, \divisor + clz \spare, \dividend + sub \order, \order, \spare + rsbs \order, \order, #31 + addne pc, pc, \order, lsl #3 + nop + .set shift, 32 + .rept 32 + .set shift, shift - 1 + cmp \dividend, \divisor, lsl #shift + subcs \dividend, \dividend, \divisor, lsl #shift + .endr + +#else /* __ARM_ARCH__ < 5 || defined (__OPTIMIZE_SIZE__) */ #if __ARM_ARCH__ >= 5 clz \order, \divisor @@ -345,7 +382,7 @@ pc .req r15 sub \order, \order, \spare mov \divisor, \divisor, lsl \order -#else +#else /* __ARM_ARCH__ < 5 */ mov \order, #0 @@ -367,7 +404,7 @@ pc .req r15 addlo \order, \order, #1 blo 1b -#endif +#endif /* __ARM_ARCH__ < 5 */ @ Perform all needed substractions to keep only the reminder. @ Do comparisons in batch of 4 first. @@ -404,6 +441,9 @@ pc .req r15 4: cmp \dividend, \divisor subhs \dividend, \dividend, \divisor 5: + +#endif /* __ARM_ARCH__ < 5 || defined (__OPTIMIZE_SIZE__) */ + .endm /* ------------------------------------------------------------------------ */ .macro THUMB_DIV_MOD_BODY modulo diff --git a/gcc/config/arm/linux-elf.h b/gcc/config/arm/linux-elf.h index 9f291c0b49f..bbfcbeb3360 100644 --- a/gcc/config/arm/linux-elf.h +++ b/gcc/config/arm/linux-elf.h @@ -55,7 +55,7 @@ %{shared:-lc} \ %{!shared:%{profile:-lc_p}%{!profile:-lc}}" -#define LIBGCC_SPEC "%{msoft-float:-lfloat} -lgcc" +#define LIBGCC_SPEC "%{msoft-float:-lfloat} %{mfloat-abi=soft*:-lfloat} -lgcc" /* Provide a STARTFILE_SPEC appropriate for GNU/Linux. Here we add the GNU/Linux magical crtbegin.o file (see crtstuff.c) which diff --git a/gcc/config/arm/netbsd-elf.h b/gcc/config/arm/netbsd-elf.h index f0f0f75cfa6..5c6960ef899 100644 --- a/gcc/config/arm/netbsd-elf.h +++ b/gcc/config/arm/netbsd-elf.h @@ -57,14 +57,11 @@ #define SUBTARGET_EXTRA_ASM_SPEC \ "-matpcs %{fpic|fpie:-k} %{fPIC|fPIE:-k}" -/* Default floating point model is soft-VFP. - FIXME: -mhard-float currently implies FPA. */ +/* Default to full VFP if -mhard-float is specified. 
*/ #undef SUBTARGET_ASM_FLOAT_SPEC #define SUBTARGET_ASM_FLOAT_SPEC \ - "%{mhard-float:-mfpu=fpa} \ - %{msoft-float:-mfpu=softvfp} \ - %{!mhard-float: \ - %{!msoft-float:-mfpu=softvfp}}" + "%{mhard-float:{!mfpu=*:-mfpu=vfp}} \ + %{mfloat-abi=hard:{!mfpu=*:-mfpu=vfp}}" #undef SUBTARGET_EXTRA_SPECS #define SUBTARGET_EXTRA_SPECS \ @@ -171,3 +168,7 @@ do \ (void) sysarch (0, &s); \ } \ while (0) + +#undef FPUTYPE_DEFAULT +#define FPUTYPE_DEFAULT FPUTYPE_VFP + diff --git a/gcc/config/arm/semi.h b/gcc/config/arm/semi.h index 8847f8c2369..0c9dadc208f 100644 --- a/gcc/config/arm/semi.h +++ b/gcc/config/arm/semi.h @@ -64,7 +64,8 @@ %{mcpu=*:-mcpu=%*} \ %{march=*:-march=%*} \ %{mapcs-float:-mfloat} \ -%{msoft-float:-mfpu=softfpa} \ +%{msoft-float:-mfloat-abi=soft} %{mhard-float:mfloat-abi=hard} \ +%{mfloat-abi=*} %{mfpu=*} \ %{mthumb-interwork:-mthumb-interwork} \ %(subtarget_extra_asm_spec)" #endif diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md new file mode 100644 index 00000000000..a4b6bfb334d --- /dev/null +++ b/gcc/config/arm/vfp.md @@ -0,0 +1,744 @@ +;; ARM VFP coprocessor Machine Description +;; Copyright (C) 2003 Free Software Foundation, Inc. +;; Written by CodeSourcery, LLC. +;; +;; This file is part of GCC. +;; +;; GCC is free software; you can redistribute it and/or modify it +;; under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. +;; +;; GCC is distributed in the hope that it will be useful, but +;; WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;; General Public License for more details. +;; +;; You should have received a copy of the GNU General Public License +;; along with GCC; see the file COPYING. If not, write to the Free +;; Software Foundation, 59 Temple Place - Suite 330, Boston, MA +;; 02111-1307, USA. 
*/ + +;; Additional register numbers +(define_constants + [(VFPCC_REGNUM 95)] +) + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Pipeline description +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +(define_automaton "vfp11") + +;; There are 3 pipelines in the VFP11 unit. +;; +;; - An 8-stage FMAC pipeline (7 execute + writeback) with forward from +;; fourth stage for simple operations. +;; +;; - A 5-stage DS pipeline (4 execute + writeback) for divide/sqrt insns. +;; These insns also use the first execute stage of the FMAC pipeline. +;; +;; - A 4-stage LS pipeline (execute + 2 memory + writeback) with forward from +;; second memory stage for loads. + +;; We do not model Write-After-Read hazards. +;; We do not do write scheduling with the arm core, so it is only necessary +;; to model the first stage of each pipeline. +;; ??? Need to model LS pipeline properly for load/store multiple? +;; We do not model fmstat properly. This could be done by modelling pipelines +;; properly and defining an absence set between a dummy fmstat unit and all +;; other vfp units. + +(define_cpu_unit "fmac" "vfp11") + +(define_cpu_unit "ds" "vfp11") + +(define_cpu_unit "vfp_ls" "vfp11") + +;; The VFP "type" attributes differ from those used in the FPA model. +;; ffarith Fast floating point insns, e.g. abs, neg, cpy, cmp. +;; farith Most arithmetic insns. +;; fmul Double precision multiply. +;; fdivs Single precision sqrt or division. +;; fdivd Double precision sqrt or division. +;; f_load Floating point load from memory. +;; f_store Floating point store to memory. +;; f_2_r Transfer vfp to arm reg. +;; r_2_f Transfer arm to vfp reg. 
+ +(define_insn_reservation "vfp_ffarith" 4 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "ffarith")) + "fmac") + +(define_insn_reservation "vfp_farith" 8 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "farith")) + "fmac") + +(define_insn_reservation "vfp_fmul" 9 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "fmul")) + "fmac*2") + +(define_insn_reservation "vfp_fdivs" 19 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "fdivs")) + "ds*15") + +(define_insn_reservation "vfp_fdivd" 33 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "fdivd")) + "fmac+ds*29") + +;; Moves to/from arm regs also use the load/store pipeline. +(define_insn_reservation "vfp_fload" 4 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "f_load,r_2_f")) + "vfp_ls") + +(define_insn_reservation "vfp_fstore" 4 + (and (eq_attr "fpu" "vfp") + (eq_attr "type" "f_store,f_2_r")) + "vfp_ls") + + +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; +;; Insn patterns +;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + +;; SImode moves +;; ??? For now do not allow loading constants into vfp regs. This causes +;; problems because small constants get converted into adds. 
+(define_insn "*arm_movsi_vfp" + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,r,r ,m,!w,r,!w,!w, U") + (match_operand:SI 1 "general_operand" "rI,K,mi,r,r,!w,!w,Ui,!w"))] + "TARGET_ARM && TARGET_VFP && TARGET_HARD_FLOAT + && ( s_register_operand (operands[0], SImode) + || s_register_operand (operands[1], SImode))" + "@ + mov%?\\t%0, %1 + mvn%?\\t%0, #%B1 + ldr%?\\t%0, %1 + str%?\\t%1, %0 + fmsr%?\\t%0, %1\\t%@ int + fmrs%?\\t%0, %1\\t%@ int + fcpys%?\\t%0, %1\\t%@ int + flds%?\\t%0, %1\\t%@ int + fsts%?\\t%1, %0\\t%@ int" + [(set_attr "predicable" "yes") + (set_attr "type" "*,*,load1,store1,r_2_f,f_2_r,ffarith,f_load,f_store") + (set_attr "pool_range" "*,*,4096,*,*,*,*,1020,*") + (set_attr "neg_pool_range" "*,*,4084,*,*,*,*,1008,*")] +) + + +;; DImode moves + +(define_insn "*arm_movdi_vfp" + [(set (match_operand:DI 0 "nonimmediate_di_operand" "=r, r,o<>,w,r,w,w ,U") + (match_operand:DI 1 "di_operand" "rIK,mi,r ,r,w,w,Ui,w"))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "* + switch (which_alternative) + { + case 0: case 1: case 2: + return (output_move_double (operands)); + case 3: + return \"fmdrr%?\\t%P0, %1\\t%@ int\"; + case 4: + return \"fmrrd%?\\t%0, %1\\t%@ int\"; + case 5: + return \"fcpyd%?\\t%P0, %P1\\t%@ int\"; + case 6: + return \"fldd%?\\t%P0, %1\\t%@ int\"; + case 7: + return \"fstd%?\\t%P1, %0\\t%@ int\"; + default: + abort (); + } + " + [(set_attr "type" "*,load2,store2,r_2_f,f_2_r,ffarith,f_load,f_store") + (set_attr "length" "8,8,8,4,4,4,4,4") + (set_attr "pool_range" "*,1020,*,*,*,*,1020,*") + (set_attr "neg_pool_range" "*,1008,*,*,*,*,1008,*")] +) + + +;; SFmode moves + +(define_insn "*movsf_vfp" + [(set (match_operand:SF 0 "nonimmediate_operand" "=w,r,w ,U,r ,m,w,r") + (match_operand:SF 1 "general_operand" " r,w,UE,w,mE,r,w,r"))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP + && ( s_register_operand (operands[0], SFmode) + || s_register_operand (operands[1], SFmode))" + "@ + fmsr%?\\t%0, %1 + fmrs%?\\t%0, %1 + flds%?\\t%0, %1 
+ fsts%?\\t%1, %0 + ldr%?\\t%0, %1\\t%@ float + str%?\\t%1, %0\\t%@ float + fcpys%?\\t%0, %1 + mov%?\\t%0, %1\\t%@ float" + [(set_attr "predicable" "yes") + (set_attr "type" "r_2_f,f_2_r,ffarith,*,f_load,f_store,load1,store1") + (set_attr "pool_range" "*,*,1020,*,4096,*,*,*") + (set_attr "neg_pool_range" "*,*,1008,*,4080,*,*,*")] +) + + +;; DFmode moves + +(define_insn "*movdf_vfp" + [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,r,r, m,w ,U,w,r") + (match_operand:DF 1 "soft_df_operand" " r,w,mF,r,UF,w,w,r"))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "* + { + switch (which_alternative) + { + case 0: + return \"fmdrr%?\\t%P0, %Q1, %R1\"; + case 1: + return \"fmrrd%?\\t%Q0, %R0, %P1\"; + case 2: case 3: case 7: + return output_move_double (operands); + case 4: + return \"fldd%?\\t%P0, %1\"; + case 5: + return \"fstd%?\\t%P1, %0\"; + case 6: + return \"fcpyd%?\\t%P0, %P1\"; + default: + abort (); + } + } + " + [(set_attr "type" "r_2_f,f_2_r,ffarith,*,load2,store2,f_load,f_store") + (set_attr "length" "4,4,8,8,4,4,4,8") + (set_attr "pool_range" "*,*,1020,*,1020,*,*,*") + (set_attr "neg_pool_range" "*,*,1008,*,1008,*,*,*")] +) + + +;; Conditional move patterns + +(define_insn "*movsfcc_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w,w,w,w,w,w,?r,?r,?r") + (if_then_else:SF + (match_operator 3 "arm_comparison_operator" + [(match_operand 4 "cc_register" "") (const_int 0)]) + (match_operand:SF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w") + (match_operand:SF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "@ + fcpys%D3\\t%0, %2 + fcpys%d3\\t%0, %1 + fcpys%D3\\t%0, %2\;fcpys%d3\\t%0, %1 + fmsr%D3\\t%0, %2 + fmsr%d3\\t%0, %1 + fmsr%D3\\t%0, %2\;fmsr%d3\\t%0, %1 + fmrs%D3\\t%0, %2 + fmrs%d3\\t%0, %1 + fmrs%D3\\t%0, %2\;fmrs%d3\\t%0, %1" + [(set_attr "conds" "use") + (set_attr "length" "4,4,8,4,4,8,4,4,8") + (set_attr "type" "ffarith,ffarith,ffarith,r_2_f,r_2_f,r_2_f,f_2_r,f_2_r,f_2_r")] +) 
+ +(define_insn "*movdfcc_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w,w,w,w,w,w,?r,?r,?r") + (if_then_else:DF + (match_operator 3 "arm_comparison_operator" + [(match_operand 4 "cc_register" "") (const_int 0)]) + (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w") + (match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "@ + fcpyd%D3\\t%P0, %P2 + fcpyd%d3\\t%P0, %P1 + fcpyd%D3\\t%P0, %P2\;fcpyd%d3\\t%P0, %P1 + fmdrr%D3\\t%P0, %Q2, %R2 + fmdrr%d3\\t%P0, %Q1, %R1 + fmdrr%D3\\t%P0, %Q2, %R2\;fmdrr%d3\\t%P0, %Q1, %R1 + fmrrd%D3\\t%Q0, %R0, %P2 + fmrrd%d3\\t%Q0, %R0, %P1 + fmrrd%D3\\t%Q0, %R0, %P2\;fmrrd%d3\\t%Q0, %R0, %P1" + [(set_attr "conds" "use") + (set_attr "length" "4,4,8,4,4,8,4,4,8") + (set_attr "type" "ffarith,ffarith,ffarith,r_2_f,r_2_f,r_2_f,f_2_r,f_2_r,f_2_r")] +) + + +;; Sign manipulation functions + +(define_insn "*abssf2_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (abs:SF (match_operand:SF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fabss%?\\t%0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + +(define_insn "*absdf2_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (abs:DF (match_operand:DF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fabsd%?\\t%P0, %P1" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + +(define_insn "*negsf2_vfp" + [(set (match_operand:SF 0 "s_register_operand" "+w") + (neg:SF (match_operand:SF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnegs%?\\t%0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + +(define_insn "*negdf2_vfp" + [(set (match_operand:DF 0 "s_register_operand" "+w") + (neg:DF (match_operand:DF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnegd%?\\t%P0, %P1" + [(set_attr "predicable" 
"yes") + (set_attr "type" "ffarith")] +) + + +;; Arithmetic insns + +(define_insn "*addsf3_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (plus:SF (match_operand:SF 1 "s_register_operand" "w") + (match_operand:SF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fadds%?\\t%0, %1, %2" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*adddf3_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (plus:DF (match_operand:DF 1 "s_register_operand" "w") + (match_operand:DF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "faddd%?\\t%P0, %P1, %P2" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + + +(define_insn "*subsf3_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (minus:SF (match_operand:SF 1 "s_register_operand" "w") + (match_operand:SF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fsubs%?\\t%0, %1, %2" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*subdf3_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (minus:DF (match_operand:DF 1 "s_register_operand" "w") + (match_operand:DF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fsubd%?\\t%P0, %P1, %P2" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + + +;; Division insns + +(define_insn "*divsf3_vfp" + [(set (match_operand:SF 0 "s_register_operand" "+w") + (div:SF (match_operand:SF 1 "s_register_operand" "w") + (match_operand:SF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fdivs%?\\t%0, %1, %2" + [(set_attr "predicable" "yes") + (set_attr "type" "fdivs")] +) + +(define_insn "*divdf3_vfp" + [(set (match_operand:DF 0 "s_register_operand" "+w") + (div:DF (match_operand:DF 1 "s_register_operand" "w") + (match_operand:DF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && 
TARGET_VFP" + "fdivd%?\\t%P0, %P1, %P2" + [(set_attr "predicable" "yes") + (set_attr "type" "fdivd")] +) + + +;; Multiplication insns + +(define_insn "*mulsf3_vfp" + [(set (match_operand:SF 0 "s_register_operand" "+w") + (mult:SF (match_operand:SF 1 "s_register_operand" "w") + (match_operand:SF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmuls%?\\t%0, %1, %2" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*muldf3_vfp" + [(set (match_operand:DF 0 "s_register_operand" "+w") + (mult:DF (match_operand:DF 1 "s_register_operand" "w") + (match_operand:DF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmuld%?\\t%P0, %P1, %P2" + [(set_attr "predicable" "yes") + (set_attr "type" "fmul")] +) + + +(define_insn "*mulsf3negsf_vfp" + [(set (match_operand:SF 0 "s_register_operand" "+w") + (mult:SF (neg:SF (match_operand:SF 1 "s_register_operand" "w")) + (match_operand:SF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnmuls%?\\t%0, %1, %2" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*muldf3negdf_vfp" + [(set (match_operand:DF 0 "s_register_operand" "+w") + (mult:DF (neg:DF (match_operand:DF 1 "s_register_operand" "w")) + (match_operand:DF 2 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnmuld%?\\t%P0, %P1, %P2" + [(set_attr "predicable" "yes") + (set_attr "type" "fmul")] +) + + +;; Multiply-accumulate insns + +;; 0 = 1 * 2 + 0 +(define_insn "*mulsf3addsf_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (plus:SF (mult:SF (match_operand:SF 2 "s_register_operand" "w") + (match_operand:SF 3 "s_register_operand" "w")) + (match_operand:SF 1 "s_register_operand" "0")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmacs%?\\t%0, %2, %3" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*muldf3adddf_vfp" + [(set 
(match_operand:DF 0 "s_register_operand" "=w") + (plus:DF (mult:DF (match_operand:DF 2 "s_register_operand" "w") + (match_operand:DF 3 "s_register_operand" "w")) + (match_operand:DF 1 "s_register_operand" "0")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmacd%?\\t%P0, %P2, %P3" + [(set_attr "predicable" "yes") + (set_attr "type" "fmul")] +) + +;; 0 = 1 * 2 - 0 +(define_insn "*mulsf3subsf_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (minus:SF (mult:SF (match_operand:SF 2 "s_register_operand" "w") + (match_operand:SF 3 "s_register_operand" "w")) + (match_operand:SF 1 "s_register_operand" "0")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmscs%?\\t%0, %2, %3" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*muldf3subdf_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (minus:DF (mult:DF (match_operand:DF 2 "s_register_operand" "w") + (match_operand:DF 3 "s_register_operand" "w")) + (match_operand:DF 1 "s_register_operand" "0")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmscd%?\\t%P0, %P2, %P3" + [(set_attr "predicable" "yes") + (set_attr "type" "fmul")] +) + +;; 0 = -(1 * 2) + 0 +(define_insn "*mulsf3negsfaddsf_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (minus:SF (match_operand:SF 1 "s_register_operand" "0") + (mult:SF (match_operand:SF 2 "s_register_operand" "w") + (match_operand:SF 3 "s_register_operand" "w"))))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnmacs%?\\t%0, %2, %3" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*fmuldf3negdfadddf_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (minus:DF (match_operand:DF 1 "s_register_operand" "0") + (mult:DF (match_operand:DF 2 "s_register_operand" "w") + (match_operand:DF 3 "s_register_operand" "w"))))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnmacd%?\\t%P0, %P2, %P3" + [(set_attr "predicable" "yes") + (set_attr "type" "fmul")] 
+) + + +;; 0 = -(1 * 2) - 0 +(define_insn "*mulsf3negsfsubsf_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (minus:SF (mult:SF + (neg:SF (match_operand:SF 2 "s_register_operand" "w")) + (match_operand:SF 3 "s_register_operand" "w")) + (match_operand:SF 1 "s_register_operand" "0")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnmscs%?\\t%0, %2, %3" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*muldf3negdfsubdf_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (minus:DF (mult:DF + (neg:DF (match_operand:DF 2 "s_register_operand" "w")) + (match_operand:DF 3 "s_register_operand" "w")) + (match_operand:DF 1 "s_register_operand" "0")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fnmscd%?\\t%P0, %P2, %P3" + [(set_attr "predicable" "yes") + (set_attr "type" "fmul")] +) + + +;; Conversion routines + +(define_insn "*extendsfdf2_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (float_extend:DF (match_operand:SF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fcvtds%?\\t%P0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*truncdfsf2_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (float_truncate:SF (match_operand:DF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fcvtsd%?\\t%0, %P1" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*truncsisf2_vfp" + [(set (match_operand:SI 0 "s_register_operand" "=w") + (fix:SI (fix:SF (match_operand:SF 1 "s_register_operand" "w"))))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "ftosizs%?\\t%0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*truncsidf2_vfp" + [(set (match_operand:SI 0 "s_register_operand" "=w") + (fix:SI (fix:DF (match_operand:DF 1 "s_register_operand" "w"))))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + 
"ftosizd%?\\t%0, %P1" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*floatsisf2_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (float:SF (match_operand:SI 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fsitos%?\\t%0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + +(define_insn "*floatsidf2_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (float:DF (match_operand:SI 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fsitod%?\\t%P0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "farith")] +) + + +;; Sqrt insns. + +(define_insn "*sqrtsf2_vfp" + [(set (match_operand:SF 0 "s_register_operand" "=w") + (sqrt:SF (match_operand:SF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fsqrts%?\\t%0, %1" + [(set_attr "predicable" "yes") + (set_attr "type" "fdivs")] +) + +(define_insn "*sqrtdf2_vfp" + [(set (match_operand:DF 0 "s_register_operand" "=w") + (sqrt:DF (match_operand:DF 1 "s_register_operand" "w")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fsqrtd%?\\t%P0, %P1" + [(set_attr "predicable" "yes") + (set_attr "type" "fdivd")] +) + + +;; Patterns to split/copy vfp condition flags. + +(define_insn "*movcc_vfp" + [(set (reg CC_REGNUM) + (reg VFPCC_REGNUM))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "fmstat%?" 
+ [(set_attr "conds" "set") + (set_attr "type" "ffarith")] +) + +(define_insn_and_split "*cmpsf_split_vfp" + [(set (reg:CCFP CC_REGNUM) + (compare:CCFP (match_operand:SF 0 "s_register_operand" "w") + (match_operand:SF 1 "vfp_compare_operand" "wG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "#" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + [(set (reg:CCFP VFPCC_REGNUM) + (compare:CCFP (match_dup 0) + (match_dup 1))) + (set (reg:CCFP CC_REGNUM) + (reg:CCFP VFPCC_REGNUM))] + "" +) + +(define_insn_and_split "*cmpsf_trap_split_vfp" + [(set (reg:CCFPE CC_REGNUM) + (compare:CCFPE (match_operand:SF 0 "s_register_operand" "w") + (match_operand:SF 1 "vfp_compare_operand" "wG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "#" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + [(set (reg:CCFPE VFPCC_REGNUM) + (compare:CCFPE (match_dup 0) + (match_dup 1))) + (set (reg:CCFPE CC_REGNUM) + (reg:CCFPE VFPCC_REGNUM))] + "" +) + +(define_insn_and_split "*cmpdf_split_vfp" + [(set (reg:CCFP CC_REGNUM) + (compare:CCFP (match_operand:DF 0 "s_register_operand" "w") + (match_operand:DF 1 "vfp_compare_operand" "wG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "#" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + [(set (reg:CCFP VFPCC_REGNUM) + (compare:CCFP (match_dup 0) + (match_dup 1))) + (set (reg:CCFP CC_REGNUM) + (reg:CCFP VFPCC_REGNUM))] + "" +) + +(define_insn_and_split "*cmpdf_trap_split_vfp" + [(set (reg:CCFPE CC_REGNUM) + (compare:CCFPE (match_operand:DF 0 "s_register_operand" "w") + (match_operand:DF 1 "vfp_compare_operand" "wG")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "#" + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + [(set (reg:CCFPE VFPCC_REGNUM) + (compare:CCFPE (match_dup 0) + (match_dup 1))) + (set (reg:CCFPE CC_REGNUM) + (reg:CCFPE VFPCC_REGNUM))] + "" +) + + +;; Comparison patterns + +(define_insn "*cmpsf_vfp" + [(set (reg:CCFP VFPCC_REGNUM) + (compare:CCFP (match_operand:SF 0 "s_register_operand" "w,w") +
(match_operand:SF 1 "vfp_compare_operand" "w,G")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "@ + fcmps%?\\t%0, %1 + fcmpzs%?\\t%0" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + +(define_insn "*cmpsf_trap_vfp" + [(set (reg:CCFPE VFPCC_REGNUM) + (compare:CCFPE (match_operand:SF 0 "s_register_operand" "w,w") + (match_operand:SF 1 "vfp_compare_operand" "w,G")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "@ + fcmpes%?\\t%0, %1 + fcmpezs%?\\t%0" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + +(define_insn "*cmpdf_vfp" + [(set (reg:CCFP VFPCC_REGNUM) + (compare:CCFP (match_operand:DF 0 "s_register_operand" "w,w") + (match_operand:DF 1 "vfp_compare_operand" "w,G")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "@ + fcmpd%?\\t%P0, %P1 + fcmpzd%?\\t%P0" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + +(define_insn "*cmpdf_trap_vfp" + [(set (reg:CCFPE VFPCC_REGNUM) + (compare:CCFPE (match_operand:DF 0 "s_register_operand" "w,w") + (match_operand:DF 1 "vfp_compare_operand" "w,G")))] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "@ + fcmped%?\\t%P0, %P1 + fcmpezd%?\\t%P0" + [(set_attr "predicable" "yes") + (set_attr "type" "ffarith")] +) + + +;; Store multiple insn used in function prologue. + +(define_insn "*push_multi_vfp" + [(match_parallel 2 "multi_register_push" + [(set (match_operand:BLK 0 "memory_operand" "=m") + (unspec:BLK [(match_operand:DF 1 "s_register_operand" "w")] + UNSPEC_PUSH_MULT))])] + "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP" + "* return vfp_output_fstmx (operands);" + [(set_attr "type" "f_store")] +) + + +;; Unimplemented insns: +;; fldm* +;; fstm* +;; fmdhr et al (VFPv1) +;; Support for xD (single precision only) variants. +;; fmrrs, fmsrr +;; fuito* +;; ftoui* diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi index 13885074e9e..8379fe773bb 100644 --- a/gcc/doc/install.texi +++ b/gcc/doc/install.texi @@ -926,12 +926,13 @@ and SPARC@.
@itemx --with-arch=@var{cpu} @itemx --with-tune=@var{cpu} @itemx --with-abi=@var{abi} +@itemx --with-fpu=@var{type} @itemx --with-float=@var{type} These configure options provide default values for the @option{-mschedule=}, -@option{-march=}, @option{-mtune=}, and @option{-mabi=} options and for -@option{-mhard-float} or @option{-msoft-float}. As with @option{--with-cpu}, -which switches will be accepted and acceptable values of the arguments depend -on the target. +@option{-march=}, @option{-mtune=}, @option{-mabi=}, and @option{-mfpu=} +options and for @option{-mhard-float} or @option{-msoft-float}. As with +@option{--with-cpu}, which switches will be accepted and acceptable values +of the arguments depend on the target. @item --enable-altivec Specify that the target supports AltiVec vector enhancements. This diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 6ed5e0840e2..5e7897c6c11 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -375,9 +375,9 @@ in the following sections. -msched-prolog -mno-sched-prolog @gol -mlittle-endian -mbig-endian -mwords-little-endian @gol -malignment-traps -mno-alignment-traps @gol --msoft-float -mhard-float -mfpe @gol +-mfloat-abi=@var{name} -msoft-float -mhard-float -mfpe @gol -mthumb-interwork -mno-thumb-interwork @gol --mcpu=@var{name} -march=@var{name} -mfpe=@var{name} @gol +-mcpu=@var{name} -march=@var{name} -mfpu=@var{name} @gol -mstructure-size-boundary=@var{n} @gol -mabort-on-noreturn @gol -mlong-calls -mno-long-calls @gol @@ -6497,6 +6497,16 @@ this option. In particular, you need to compile @file{libgcc.a}, the library that comes with GCC, with @option{-msoft-float} in order for this to work. +@item -mfloat-abi=@var{name} +@opindex mfloat-abi +Specifies which ABI to use for floating point values. Permissible values +are: @samp{soft}, @samp{softfp} and @samp{hard}. + +@samp{soft} and @samp{hard} are equivalent to @option{-msoft-float} +and @option{-mhard-float} respectively.
@samp{softfp} allows the generation +of floating point instructions, but still uses the soft-float calling +conventions. + @item -mlittle-endian @opindex mlittle-endian Generate code for a processor running in little-endian mode. This is @@ -6595,16 +6605,23 @@ name to determine what kind of instructions it can emit when generating assembly code. This option can be used in conjunction with or instead of the @option{-mcpu=} option. Permissible names are: @samp{armv2}, @samp{armv2a}, @samp{armv3}, @samp{armv3m}, @samp{armv4}, @samp{armv4t}, -@samp{armv5}, @samp{armv5t}, @samp{armv5te}, @samp{armv6j}, +@samp{armv5}, @samp{armv5t}, @samp{armv5te}, @samp{armv6}, @samp{armv6j}, @samp{iwmmxt}, @samp{ep9312}. -@item -mfpe=@var{number} +@item -mfpu=@var{name} +@itemx -mfpe=@var{number} @itemx -mfp=@var{number} +@opindex mfpu @opindex mfpe @opindex mfp -This specifies the version of the floating point emulation available on -the target. Permissible values are 2 and 3. @option{-mfp=} is a synonym -for @option{-mfpe=}, for compatibility with older versions of GCC@. +This specifies what floating point hardware (or hardware emulation) is +available on the target. Permissible names are: @samp{fpa}, @samp{fpe2}, +@samp{fpe3}, @samp{maverick}, @samp{vfp}. @option{-mfp} and @option{-mfpe} +are synonyms for @option{-mfpu}=@samp{fpe}@var{number}, for compatibility +with older versions of GCC@. + +If @option{-msoft-float} is specified this specifies the format of +floating point values. @item -mstructure-size-boundary=@var{n} @opindex mstructure-size-boundary diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index a77e8caf183..131280cae19 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -1340,6 +1340,9 @@ available on some particular machines.
@item f Floating-point register +@item w +VFP floating-point register + @item F One of the floating-point constants 0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0 or 10.0 @@ -1376,6 +1379,9 @@ An item in the constant pool A symbol in the text segment of the current file @end table +@item U +A memory reference suitable for VFP load/store insns (reg+constant offset) + @item AVR family---@file{avr.h} @table @code @item l diff --git a/gcc/longlong.h b/gcc/longlong.h index 5c128163590..03150201103 100644 --- a/gcc/longlong.h +++ b/gcc/longlong.h @@ -186,7 +186,7 @@ do { \ UDItype __umulsidi3 (USItype, USItype); #endif -#if defined (__arm__) && W_TYPE_SIZE == 32 +#if defined (__arm__) && !defined (__thumb__) && W_TYPE_SIZE == 32 #define add_ssaaaa(sh, sl, ah, al, bh, bl) \ __asm__ ("adds %1, %4, %5\n\tadc %0, %2, %3" \ : "=r" ((USItype) (sh)), \ -- 2.11.4.GIT